PREFACE



The intended reader of this book is a graduate student beginning a doctoral pro- 
gram in physics or a closely related subject, who wants to understand the physical 
and mathematical foundations of analytical mechanics and the relation of classical 
mechanics to relativity and quantum theory. 

The book’s distinguishing feature is the introduction of extended Lagrangian and 
Hamiltonian methods that treat time as a transformable coordinate, rather than as the 
universal time parameter of traditional Newtonian physics. This extended theory is 
introduced in Part II, and is used for the more advanced topics such as covariant me- 
chanics, Noether’s theorem, canonical transformations, and Hamilton-Jacobi theory. 

The obvious motivation for this extended approach is its consistency with special 
relativity. Since time is allowed to transform, the Lorentz transformation of special 
relativity becomes a canonical transformation. At the start of the twenty-first century, 
some hundred years after Einstein’s 1905 papers, it is no longer acceptable to use the 
traditional definition of canonical transformation that excludes the Lorentz transfor- 
mation. The book takes the position that special relativity is now a part of standard 
classical mechanics and should be treated integrally with the other, more traditional, 
topics. Chapters are included on special relativistic spacetime, fourvectors, and rela- 
tivistic mechanics in fourvector notation. The extended Lagrangian and Hamiltonian 
methods are used to derive manifestly covariant forms of the Lagrange, Hamilton, 
and Hamilton-Jacobi equations. 

In addition to its consistency with special relativity, the use of time as a coordi- 
nate has great value even in pre-relativistic physics. It could have been adopted in 
the nineteenth century, with mathematical elegance as the rationale. When an ex- 
tended Lagrangian is used, the generalized energy theorem (sometimes called the 
Jacobi-integral theorem) becomes just another Lagrange equation. Noether’s theorem,
which normally requires a longer proof to deal with the intricacies of a varied
time parameter, becomes a one-line corollary of Hamilton’s principle. The use of ex- 
tended phase space greatly simplifies the definition of canonical transformations. In 
the extended approach (but not in the traditional theory) a transformation is canoni- 
cal if and only if it preserves the Hamilton equations. Canonical transformations can 
thus be characterized as the most general phase-space transformations under which 
the Hamilton equations are form invariant. 

This is also a book for those who study analytical mechanics as a preliminary to 
a critical exploration of quantum mechanics. Comparisons to quantum mechanics ap- 
pear throughout the text, and classical mechanics itself is presented in a way that will 
aid the reader in the study of quantum theory. A chapter is devoted to linear vector 
operators and dyadics, including a comparison to the bra-ket notation of quantum 
mechanics. Rotations are presented using an operator formalism similar to that used 
in quantum theory, and the definition of the Euler angles follows the quantum mechanical
convention. The extended Hamiltonian theory with time as a coordinate is
compared to Dirac’s formalism of primary phase-space constraints. The chapter on 
relativistic mechanics shows how to use covariant Hamiltonian theory to write the 
Klein-Gordon and Dirac wave functions. The chapter on Hamilton-Jacobi theory in- 
cludes a discussion of the closely related Bohm hidden variable model of quantum 
mechanics. 

The reader is assumed to be familiar with ordinary three-dimensional vectors, 
and to have studied undergraduate mechanics and linear algebra. Familiarity with 
the notation of modern differential geometry is not assumed. In order to appreciate 
the advance that the differential-geometric notation represents, a student should first 
acquire the background knowledge that was taken for granted by those who created 
it. The present book is designed to take the reader up to the point at which the 
methods of differential geometry should properly be introduced — before launching 
into phase-space flow, chaotic motion, and other topics where a geometric language 
is essential. 

Each chapter in the text ends with a set of exercises, some of which extend the 
material in the chapter. The book attempts to maintain a level of mathematical rigor 
sufficient to allow the reader to see clearly the assumptions being made and their 
possible limitations. To assist the reader, arguments in the main body of the text fre- 
quently refer to the mathematical appendices, collected in Part III, that summarize 
various theorems that are essential for mechanics. I have found that even the most 
talented students sometimes lack an adequate mathematical background, particularly 
in linear algebra and many-variable calculus. The mathematical appendices are de- 
signed to refresh the reader’s memory on these topics, and to give pointers to other 
texts where more information may be found. 

This book can be used in the first year of a doctoral physics program to provide a 
necessary bridge from undergraduate mechanics to advanced relativity and quantum 
theory. Unfortunately, such bridge courses are sometimes dropped from the curricu- 
lum and replaced by a brief classical review in the graduate quantum course. The risk 
of this is that students may learn the recipes of quantum mechanics but lack knowl- 
edge of its classical roots. This seems particularly unwise at the moment, since several 
of the current problems in theoretical physics — the development of quantum informa- 
tion technology, and the problem of quantizing the gravitational field, to name two — 
require a fundamental rethinking of the quantum-classical connection. Since progress 
in physics depends on researchers who understand the foundations of theories and 
not just the techniques of their application, it is hoped that this text may encourage 
the retention or restoration of introductory graduate analytical mechanics courses. 



Oliver Davis Johns 
San Francisco, California 
April 2005 
BASIC DYNAMICS OF POINT PARTICLES AND COLLECTIONS



Modern mechanics begins with the publication in 1687 of Isaac Newton’s Principia, an 
extension of the work of his predecessors, notably Galileo and Descartes, that allows 
him to explain mathematically what he calls the “System of the World”: the motions of 
planets, moons, comets, tides. The three “Axioms, or Laws of Motion” in the Principia 
(Newton, 1729) are: 

Law I: Every body perseveres in its state of rest, or of uniform motion in 
a right line, unless it is compelled to change that state by forces impressed 
thereon. 

Law II: The alteration of motion is ever proportional to the motive force 
impressed; and is made in the direction of the right line in which that force 
is impressed. 

Law III: To every Action there is always opposed an equal Reaction: or the 
mutual actions of two bodies upon each other are always equal, and directed 
to contrary parts. 

These axioms refer to the general behavior of a “body.” It is clear from Newton’s 
examples (projectiles, a top, planets, comets, a stone) in the same section that he 
intends these bodies to be macroscopic, ordinary objects. 

But elsewhere Newton (1730) refers to the “particles of bodies” in ways that sug- 
gest an atomic theory in which the primitive, elementary objects are small, indestruc- 
tible, “solid, massy, hard, impenetrable, movable particles.” These are what we will 
call the point particles of Newtonian physics. Newton says of them that, “these prim- 
itive Particles being Solids, are incomparably harder than any porous Bodies com- 
pounded of them; even so very hard as never to wear or break in pieces; no ordinary 
Power being able to divide what God himself made one in the first Creation.” 

The present chapter will begin with the assumption that Newton’s three axioms 
refer fundamentally to these point particles. After deriving the laws of momentum, 
angular momentum, and work-energy for point particles, we will show that, given 
certain plausible and universally accepted additional axioms, essentially the same 
laws can be proved to apply to macroscopic bodies, considered as collections of the 
elementary point particles. 

1.1 Newton’s Space and Time 

Before discussing the laws of motion of point masses, we must consider the space and 
time in which that motion takes place. For Newton, space was logically and physically 
distinct from the masses that might occupy it. Space provided a static, absolute, and 
independent reference with respect to which all particle positions and motions were
to be measured. Space could be perceived by looking at the fixed stars which were 
presumed to be at rest relative to it. Newton also emphasized the ubiquity of space, 
comparing it to the sensorium of God. 1 

Newton thought of time geometrically, comparing it to a mathematical point mov- 
ing steadily along a straight line. As with space, the even flow of time was absolute 
and independent of objects. He writes in the Principia, “Absolute, true and mathe- 
matical time, of itself, and from its own nature, flows equably without relation to 
anything external.” 2 

In postulating an absolute space, Newton was breaking with Descartes, who held 
that the proper definition of motion was motion with respect to nearby objects. In 
the Principia, Newton uses the example of a spinning bucket filled with water to 
argue for absolute motion. If the bucket is suspended by a rope from a tree limb and 
then twisted, upon release the bucket will initially spin rapidly but the water will 
remain at rest. One observes that the surface of the water remains flat. Later, when 
the water has begun to rotate with the bucket, the surface of the water will now be 
concave, in response to the forces required to maintain its accelerated circular motion. 
If motion were to be measured with respect to proximate objects, one would expect 
the opposite observations. Initially, there is a large relative motion between the water 
and the proximate bucket, and later the two have nearly zero relative motion. So the 
Cartesian view would predict inertial effects initially, with the water surface becoming 
flat later, contrary to observation. 

Newton realized that, as a practical matter, motion would often be measured by 
reference to objects rather than to absolute space directly. As we discuss in Section 
14.1, the Galilean relativity principle states that Newton’s laws hold when position 
is measured with respect to inertial systems that are either at rest, or moving with 
constant velocity, relative to absolute space. But Newton considered these relative 
standards to be secondary, merely stand-ins for space. 

Nearly the opposite view was held by Newton’s great opponent, Leibniz, who held 
that space is a “mere seeming thing” and that the only reality is the relation of objects. 
Their debate took the form of an exchange of letters, later published, between Leibniz 
and Clarke, Newton’s surrogate. 3 Every student is urged to read them. The main diffi- 
culty for the modern reader is the abundance of theological arguments, mixed almost 
inextricably with the physical ones. One can appreciate the enormous progress that 
has been made since the seventeenth century in freeing physics from the constraints 
of theology. In the century after Newton and Leibniz, their two philosophical tradi- 
tions continued to compete. But the success of the Newtonian method in explaining 

1 Seventeenth century physiology held that the information from human sense organs is collected in a 
“sensorium” which the soul then views. 

2 Newton’s ideas about time were possibly influenced by those of his predecessor at Cambridge, Isaac
Barrow. See Chapter 9 of Whitrow (1989). 

3 The correspondence is reprinted, with portions of Newton’s writings, in Alexander (1956). 
experiments and phenomena led to its gradual ascendency. 4

Newton’s space and time were challenged by Mach in the late nineteenth century. 
Mach argued, like Leibniz, that absolute space and time are illusory and that the only 
reality is the relation of objects. 5 Mach also proposed that the inertia of a particle
is related to the existence of other particles and presumably would vanish without 
them, an idea that Einstein referred to as Mach’s Principle. 

Einstein’s special relativity unifies space and time. And in his general relativity the 
metric of the combined spacetime becomes dynamic rather than static and absolute. 
General relativity is Machian in the sense that the masses of the universe affect the 
local curvature of spacetime, but Newtonian in the sense that spacetime itself (now 
represented by the dynamic metric field) is something all pervasive that has definite 
properties even at points containing no masses. 

For the remainder of Part I of the book, we will adopt the traditional Newto- 
nian definition of space and time. In Part II, we will consider the modifications of 
Lagrangian and Hamiltonian mechanics that are needed to accommodate special rel- 
ativity, in which space and time are combined and time becomes a transformable 
coordinate. 

1.2 Single Point Particle 

In this section, we assume the applicability of Newton’s laws to point particles, and 
introduce the basic derived quantities: momentum, angular momentum, work, kinetic 
energy, and their relations. 

An uncharged point particle is characterized completely by its mass m and its 
position r relative to the origin of some inertial system of coordinates. The velocity 
v = dr/dt and acceleration a = dv/dt are derived by successive differentiation. Its
momentum (which is what Newton called “motion” in his second law) is defined as 

$\mathbf{p} = m\mathbf{v}$ (1.1)

Newton’s second law then can be expressed as the law of momentum for point particles,

$\mathbf{f} = \frac{d\mathbf{p}}{dt}$ (1.2)

Since the mass of a point particle is unchanging, this is equivalent to the more familiar
f = ma. The requirement that the change of momentum is “in the direction of the right

4 Leibnizian ideas continued to be influential, however. The great eighteenth century mathematician 
Euler, to whom our subject owes so much, published in 1768 a widely read book, Letters Addressed to a 
German Princess, in which he explained the science of his day to the lay person (Euler, 1823). He felt 
it necessary to devote some thirty pages of that book to refute Wolff, the chief proponent of Leibniz’s 
philosophy. See also the detailed defense of Newton’s ideas in Euler, L. (1748) “Réflexions sur l’Espace et
le Temps,” Mémoires de l’Académie des Sciences de Berlin, reprinted in Series III, Volume 2 of Euler (1911).

5 See Mach (1907). Discussions of Mach’s ideas are found in Rindler (1977, 2001) and Misner, Thorne 
and Wheeler (1973). A review of the history of spacetime theories from a Machian perspective is found in 
Barbour (1989, 2001). See also Barbour and Pfister (1995). 
line” of the impressed force f is guaranteed in modern notation by the use of vector
quantities in the equations. 

For the point particles, Newton’s first law follows directly from eqn (1.2). When 
f = 0, the time derivative of p is zero and so p is a constant vector. Note that eqn 
(1.2) is a vector relation. If, for example, the x-component of force $f_x$ is zero, then
the corresponding momentum component $p_x$ will be constant regardless of what the
other components may do. 

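The angular momentum $\mathbf{j}$ of a point particle and the torque $\boldsymbol{\tau}$ acting on it are
defined, respectively, as

$\mathbf{j} = \mathbf{r} \times \mathbf{p} \qquad \boldsymbol{\tau} = \mathbf{r} \times \mathbf{f}$ (1.3)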

It follows that the law of angular momentum for point particles is 



$\boldsymbol{\tau} = \frac{d\mathbf{j}}{dt}$ (1.4)



since 



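$\frac{d\mathbf{j}}{dt} = \frac{d\mathbf{r}}{dt} \times \mathbf{p} + \mathbf{r} \times \frac{d\mathbf{p}}{dt} = \mathbf{v} \times m\mathbf{v} + \mathbf{r} \times \mathbf{f} = 0 + \boldsymbol{\tau}$ (1.5)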

In a time dt the particle moves a vector distance $d\mathbf{r} = \mathbf{v}\,dt$. The work dW done by
force f in this time is defined as 

$dW = \mathbf{f} \cdot d\mathbf{r}$ (1.6)

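This work is equal to the increment of the quantity $\tfrac{1}{2}mv^2$ since

$dW = \mathbf{f} \cdot \mathbf{v}\,dt = \frac{d(m\mathbf{v})}{dt} \cdot \mathbf{v}\,dt = m\,(d\mathbf{v}) \cdot \mathbf{v} = d\left(\tfrac{1}{2}mv^2\right)$ (1.7)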



Taking a particle at rest to have zero kinetic energy, we define the kinetic energy T as 



$T = \tfrac{1}{2}mv^2$ (1.8)



with the result that a work-energy theorem for point particles may be expressed as 
dW = dT or 



$\mathbf{f} \cdot \mathbf{v} = \frac{dT}{dt}$ (1.9)



If the force f is either zero or constantly perpendicular to v (as is the case for purely 
magnetic forces on a charged particle, for example) then the left side of eqn (1.9) will 
vanish and the kinetic energy T will be constant. 
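The magnetic example can be checked numerically. The following minimal Python sketch assumes the standard magnetic force law f = q v × B, which is not written out in the text, together with arbitrary sample values for q, v, and B. It evaluates the left side of eqn (1.9) and finds that it vanishes, so that dT/dt = 0.

```python
# Sketch: a purely magnetic force does no work, so dT/dt = f . v = 0.
# The force law f = q v x B and all numerical values are illustrative assumptions.

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

q = 2                 # charge (arbitrary units)
v = (3, -1, 4)        # velocity
B = (0, 5, 2)         # magnetic field

f = tuple(q * c for c in cross(v, B))   # magnetic force on the particle
power = dot(f, v)                       # f . v, the left side of eqn (1.9)
print(f, power)
```

Integer inputs are used so that the cancellation is exact rather than approximate.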



1.3 Collective Variables 

Now imagine a collection of N point particles labeled by index n, with masses $m_1$,
$m_2, \ldots, m_N$ and positions $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N$.

Fig. 1.1. A collection of point masses.

The other quantities defined in Section 1.2 will be indexed similarly, with $\mathbf{p}_n = m_n\mathbf{v}_n$,
for example, referring to the momentum of the nth particle and $\mathbf{f}_n$ denoting the
force acting on it. The total mass, momentum, force, angular momentum, torque, and
kinetic energy of this collection may be defined by

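$M = \sum_{n=1}^{N} m_n \qquad \mathbf{P} = \sum_{n=1}^{N} \mathbf{p}_n \qquad \mathbf{F} = \sum_{n=1}^{N} \mathbf{f}_n \qquad \mathbf{J} = \sum_{n=1}^{N} \mathbf{j}_n \qquad \boldsymbol{\tau} = \sum_{n=1}^{N} \boldsymbol{\tau}_n \qquad T = \sum_{n=1}^{N} T_n$ (1.10)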

Note that, in the cases of P, F, J, and $\boldsymbol{\tau}$, these are vector sums. If a particular collection
consisted of two identical particles moving at equal speeds in opposite directions, for 
example, P would be zero. 
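The two-particle example can be made concrete with a short Python sketch. The masses and velocities below are arbitrary illustrative values, and the helper name `vec_sum` is hypothetical. The sketch shows that the vector sum P cancels while the scalar sum T does not.

```python
# Sketch: collective variables of eqn (1.10) for two identical particles
# moving with equal speeds in opposite directions. Values are illustrative.

def vec_sum(vectors):
    """Component-wise sum of a list of 3-vectors."""
    return tuple(sum(c) for c in zip(*vectors))

masses = [2, 2]
velocities = [(3, 0, 0), (-3, 0, 0)]

M = sum(masses)                                                  # total mass
momenta = [tuple(m * c for c in v) for m, v in zip(masses, velocities)]
P = vec_sum(momenta)                                             # vector sum
T = sum(0.5 * m * sum(c * c for c in v)                          # scalar sum
        for m, v in zip(masses, velocities))
print(M, P, T)
```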

In the following sections, we derive the equations of motion for these collective 
variables. All of the equations of Section 1.2 are assumed to hold individually for each 
particle in the collection, with the obvious addition of subscripts n to each quantity to 
label the particular particle being considered. For example, $\mathbf{v}_n = d\mathbf{r}_n/dt$, $\mathbf{a}_n = d\mathbf{v}_n/dt$,
$\mathbf{p}_n = m_n\mathbf{v}_n$, $\mathbf{f}_n = d\mathbf{p}_n/dt$, $\mathbf{f}_n = m_n\mathbf{a}_n$, etc.



1.4 The Law of Momentum for Collections 

We begin with the law of momentum. Differentiation of the sum for P in eqn (1.10), 
using eqn (1.2) in the indexed form $d\mathbf{p}_n/dt = \mathbf{f}_n$, gives



$\frac{d\mathbf{P}}{dt} = \frac{d}{dt}\sum_{n=1}^{N} \mathbf{p}_n = \sum_{n=1}^{N} \frac{d\mathbf{p}_n}{dt} = \sum_{n=1}^{N} \mathbf{f}_n = \mathbf{F}$ (1.11)



The time rate of change of the total momentum is thus the total force. 

But the force $\mathbf{f}_n$ on the nth particle may be examined in more detail. Suppose that
it can be written as the vector sum of an external force $\mathbf{f}_n^{(\mathrm{ext})}$ coming from influences
operating on the collection from outside it, and an internal force consisting of all 
forces that cannot be identified as external, such as forces on particle n coming from 
collision or other interaction with other particles in the collection. For example, if the 
collection were a globular cluster of stars (idealized here as point particles!) orbiting 
a galactic center, the external force on star n would be the gravitational attraction 
from the galaxy, and the internal force would be the gravitational attraction of the
other stars in the cluster. Thus 



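$\mathbf{f}_n = \mathbf{f}_n^{(\mathrm{ext})} + \mathbf{f}_n^{(\mathrm{int})}$ and, correspondingly, $\mathbf{F} = \mathbf{F}^{(\mathrm{ext})} + \mathbf{F}^{(\mathrm{int})}$ (1.12)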



where 



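$\mathbf{F}^{(\mathrm{ext})} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{ext})}$ and $\mathbf{F}^{(\mathrm{int})} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{int})}$ (1.13)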



Axiom 1.4.1: The Law of Momentum 

It is taken as an axiom in all branches of modern physics that, insofar as the action of 
outside influences can be represented by forces, the following Law of Momentum must 
hold: 

$\mathbf{F}^{(\mathrm{ext})} = \frac{d\mathbf{P}}{dt}$ (1.14)



It follows from this Law and eqn (1.11) that $\mathbf{F} = \mathbf{F}^{(\mathrm{ext})}$ and hence $\mathbf{F}^{(\mathrm{int})} = 0$. Identifying
P with Newton’s “motion” of a body, and $\mathbf{F}^{(\mathrm{ext})}$ with his “motive force impressed”
on it, eqn (1.14) simply restates Newton’s second law for bodies, now considered as 
collections of point particles. 

An immediate consequence of the Law of Momentum is that the vanishing of 
$\mathbf{F}^{(\mathrm{ext})}$ makes P constant. We then say that P is conserved. This rule of momentum
conservation is generally believed to apply even for those situations that cannot be 
described correctly by the concept of force. This is the essential content of Newton’s 
first law. The total momentum of an isolated body does not change. 



1.5 The Law of Angular Momentum for Collections 



The derivation of the Law of Angular Momentum is similar to the previous Section 
1.4. Differentiation of the sum for J in eqn (1.10), using eqn (1.4) in the indexed form 
$d\mathbf{j}_n/dt = \boldsymbol{\tau}_n$, gives



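$\frac{d\mathbf{J}}{dt} = \frac{d}{dt}\sum_{n=1}^{N} \mathbf{j}_n = \sum_{n=1}^{N} \frac{d\mathbf{j}_n}{dt} = \sum_{n=1}^{N} \boldsymbol{\tau}_n = \boldsymbol{\tau}$ (1.15)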



The time rate of change of the total angular momentum is thus the total torque. 

Making the same division of forces into external and internal as was done in Sec- 
tion 1.4, we use the indexed form of eqn (1.3) to write the torque on particle n as the 
sum of external and internal torques, 



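$\boldsymbol{\tau}_n = \mathbf{r}_n \times \mathbf{f}_n = \mathbf{r}_n \times \left(\mathbf{f}_n^{(\mathrm{ext})} + \mathbf{f}_n^{(\mathrm{int})}\right) = \boldsymbol{\tau}_n^{(\mathrm{ext})} + \boldsymbol{\tau}_n^{(\mathrm{int})}$ (1.16)

where

$\boldsymbol{\tau}_n^{(\mathrm{ext})} = \mathbf{r}_n \times \mathbf{f}_n^{(\mathrm{ext})}$ and $\boldsymbol{\tau}_n^{(\mathrm{int})} = \mathbf{r}_n \times \mathbf{f}_n^{(\mathrm{int})}$ (1.17)

The total torque $\boldsymbol{\tau}$ defined in eqn (1.10) may then be written

$\boldsymbol{\tau} = \boldsymbol{\tau}^{(\mathrm{ext})} + \boldsymbol{\tau}^{(\mathrm{int})}$ (1.18)

where

$\boldsymbol{\tau}^{(\mathrm{ext})} = \sum_{n=1}^{N} \boldsymbol{\tau}_n^{(\mathrm{ext})}$ and $\boldsymbol{\tau}^{(\mathrm{int})} = \sum_{n=1}^{N} \boldsymbol{\tau}_n^{(\mathrm{int})}$ (1.19)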



Axiom 1.5.1: The Law of Angular Momentum 

It is taken as an axiom in all branches of modern physics that, insofar as the action of 
outside influences can be represented by forces, the following Law of Angular Momentum 
must hold: 



$\boldsymbol{\tau}^{(\mathrm{ext})} = \frac{d\mathbf{J}}{dt}$ (1.20)



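It follows from this Law and eqn (1.15) that $\boldsymbol{\tau} = \boldsymbol{\tau}^{(\mathrm{ext})}$ and hence $\boldsymbol{\tau}^{(\mathrm{int})} = 0$. An immediate
consequence of the Law of Angular Momentum is that the vanishing of $\boldsymbol{\tau}^{(\mathrm{ext})}$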
makes J constant. We then say that J is conserved. This rule of angular momentum 
conservation is generally believed to apply even for those situations that cannot be de- 
scribed correctly by the concept of force. The total angular momentum of an isolated 
body does not change. 

It is important to notice that the Laws of Momentum and Angular Momentum 
are vector relations. For example, in eqn (1.14), if $F_y^{(\mathrm{ext})} = 0$ then $P_y$ is conserved
regardless of the values of the other components of the total external force. A similar 
separation of components holds also in eqn (1.20). 



1.6 “Derivations” of the Axioms 

Although the Law of Momentum is an axiom, it can actually be “derived” if one accepts
an outdated action-at-a-distance model of internal forces in which the force $\mathbf{f}_n^{(\mathrm{int})}$
is taken as the instantaneous vector sum of forces on particle n coming from all of the
other particles in the collection. Denote the force on particle n coming from particle
n' as $\mathbf{f}_{nn'}$ and thus write



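$\mathbf{f}_n^{(\mathrm{int})} = \sum_{\substack{n'=1 \\ n' \neq n}}^{N} \mathbf{f}_{nn'}$ and hence $\mathbf{F}^{(\mathrm{int})} = \sum_{n=1}^{N} \sum_{\substack{n'=1 \\ n' \neq n}}^{N} \mathbf{f}_{nn'}$ (1.21)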

In this model, Newton’s third law applied to the point particles implies that 

$\mathbf{f}_{nn'} = -\mathbf{f}_{n'n}$ (1.22)

which makes the symmetric double sum in eqn (1.21) vanish identically. With $\mathbf{F}^{(\mathrm{int})} = 0$,
eqns (1.11, 1.12) then imply eqn (1.14), as was to be proved. Equation (1.22) is
sometimes referred to as the weak form of Newton’s third law. We emphasize, however, 
that the Law of Momentum is more general than the action-at-a-distance model of the 
internal forces used in this derivation. 
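The cancellation in the double sum of eqn (1.21) can be illustrated with a small Python sketch. The three-particle force table below is an arbitrary, hypothetical example chosen only to satisfy eqn (1.22); the forces are deliberately not central, since centrality is not needed for this result.

```python
# Sketch: pairwise internal forces obeying the weak third law, eqn (1.22),
# sum to zero in the double sum of eqn (1.21). The force values are
# arbitrary illustrative numbers and are deliberately not central.

N = 3
# pair_force[(n, k)] is the force on particle n due to particle k, for n < k.
pair_force = {
    (0, 1): (1, -2, 0),
    (0, 2): (4, 0, 3),
    (1, 2): (-5, 7, 1),
}

def f(n, k):
    """Force on particle n from particle k, built so that f(n, k) = -f(k, n)."""
    if n < k:
        return pair_force[(n, k)]
    return tuple(-c for c in pair_force[(k, n)])

def vec_sum(vectors):
    """Component-wise sum of a list of 3-vectors."""
    return tuple(sum(c) for c in zip(*vectors))

# Internal force on each particle: first of eqn (1.21).
f_int = [vec_sum([f(n, k) for k in range(N) if k != n]) for n in range(N)]
# Total internal force: second of eqn (1.21).
F_int = vec_sum(f_int)
print(f_int, F_int)
```

The individual internal forces are nonzero, but their total vanishes because each pair contributes two equal and opposite terms.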

The Law of Angular Momentum is also an axiom but, just as in the case of linear 
momentum, it too can be “derived” from an outdated action-at-a-distance model of 
internal forces. We again denote the force on particle n coming from particle n' as $\mathbf{f}_{nn'}$
and thus write 

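$\boldsymbol{\tau}_n^{(\mathrm{int})} = \mathbf{r}_n \times \mathbf{f}_n^{(\mathrm{int})} = \sum_{\substack{n'=1 \\ n' \neq n}}^{N} \mathbf{r}_n \times \mathbf{f}_{nn'}$ and hence $\boldsymbol{\tau}^{(\mathrm{int})} = \sum_{n=1}^{N} \sum_{\substack{n'=1 \\ n' \neq n}}^{N} \mathbf{r}_n \times \mathbf{f}_{nn'}$ (1.23)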

It follows from eqn (1.22) that the second of eqn (1.23) may be rewritten as 

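$\boldsymbol{\tau}^{(\mathrm{int})} = \frac{1}{2} \sum_{n=1}^{N} \sum_{\substack{n'=1 \\ n' \neq n}}^{N} (\mathbf{r}_n - \mathbf{r}_{n'}) \times \mathbf{f}_{nn'}$ (1.24)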

If we now assume (which we did not need to assume in the linear momentum case) 
that the force $\mathbf{f}_{nn'}$ is central, that is parallel (or anti-parallel) to the line $(\mathbf{r}_n - \mathbf{r}_{n'})$
between particles n and n', then it follows from the vanishing of the cross products
that $\boldsymbol{\tau}^{(\mathrm{int})}$ is zero, as was to be proved.

The addition of centrality to eqn (1.22) is sometimes called the strong form of 
Newton’s third law. We emphasize that, as in the case of linear momentum, the Law of 
Angular Momentum is more general than the model of central, action-at-a-distance 
internal forces used in this last derivation. 
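The role of centrality can be seen in a short Python sketch. It evaluates the double sum of eqn (1.23) for two hypothetical force models on the same three particles: a linear attraction along the line joining each pair (an assumed example of a central force), and a force that satisfies eqn (1.22) but always points along the y axis. Positions and constants are illustrative.

```python
# Sketch: internal torque for central versus non-central pairwise forces.
# Both force models satisfy eqn (1.22); only the first is central.
# Positions, the constant k_spring, and both force laws are assumptions.

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vec_sum(vectors):
    """Component-wise sum of a list of 3-vectors."""
    return tuple(sum(c) for c in zip(*vectors))

positions = [(0, 0, 0), (1, 0, 0), (2, 3, -1)]
k_spring = 2

def central(n, k):
    """Force on n from k, directed along the line from r_n to r_k."""
    return tuple(k_spring * (b - a) for a, b in zip(positions[n], positions[k]))

def non_central(n, k):
    """Force on n from k, antisymmetric in (n, k) but always along y."""
    sign = 1 if n < k else -1
    return (0, sign, 0)

def internal_torque(pair_force):
    """Double sum of eqn (1.23): sum over n and k != n of r_n x f_nk."""
    terms = [cross(positions[n], pair_force(n, k))
             for n in range(len(positions))
             for k in range(len(positions)) if k != n]
    return vec_sum(terms)

tau_central = internal_torque(central)
tau_non_central = internal_torque(non_central)
print(tau_central, tau_non_central)
```

Only the central model gives a vanishing internal torque, which is why the strong form of the third law is needed in this derivation.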

For example, the laws of momentum and angular momentum can be applied cor- 
rectly to the behavior of physical objects such as quartz spheres, whose internal struc- 
ture requires modern solid-state physics for its description rather than Newtonian 
central forces between point masses. Yet, when there are identifiable external force 
fields acting, such as gravity for example, these objects will obey Axioms 1.4.1 and 
1.5.1. 



1.7 The Work-Energy Theorem for Collections 

The work-energy theorem of eqn (1.9) can be extended to collections. Using the def- 
inition in eqn (1.10) together with the indexed form of eqn (1.8), the total kinetic 
energy is 

$T = \sum_{n=1}^{N} T_n = \frac{1}{2}\sum_{n=1}^{N} m_n v_n^2$ with $v_n^2 = \mathbf{v}_n \cdot \mathbf{v}_n$ (1.25)

Then the time rate of change of T is equal to the rate at which work is done on all 
particles of the collection, 

$\frac{dT}{dt} = \sum_{n=1}^{N} \mathbf{f}_n \cdot \mathbf{v}_n$ (1.26)

To prove this result, differentiate the sum for T in eqn (1.10), using eqn (1.9) in its 
indexed form $dT_n/dt = \mathbf{f}_n \cdot \mathbf{v}_n$ where $\mathbf{v}_n = d\mathbf{r}_n/dt$. Then



$\frac{dT}{dt} = \frac{d}{dt}\sum_{n=1}^{N} T_n = \sum_{n=1}^{N} \frac{dT_n}{dt} = \sum_{n=1}^{N} \mathbf{f}_n \cdot \mathbf{v}_n$ (1.27)



as was to be proved. 
There is little benefit to introducing the separation of force $\mathbf{f}_n$ into external and
internal terms here, since the total kinetic energy T can be changed even when no 
external forces are present. For example, consider four identical particles initially at 
rest at the four corners of a plane square. If there is a gravitational internal force 
among those particles, they will begin to collapse toward the center of the square. 
Thus T will increase even though only internal forces are acting. 
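The four-particle example can be sketched numerically. The following Python fragment assumes Newtonian inverse-square gravity in units with G = m = 1, a square of side 2 centered on the origin, and a single explicit Euler step for the velocities; all of these are illustrative choices rather than anything specified in the text.

```python
# Sketch: four identical particles released from rest at the corners of a
# square, attracting one another by Newtonian gravity. Units with G = m = 1,
# the inverse-square law, and the single Euler step are assumptions.
import math

G, m, dt = 1.0, 1.0, 1e-3
positions = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
velocities = [(0.0, 0.0)] * 4

def internal_force(n):
    """Gravitational force on particle n from the other three particles."""
    fx = fy = 0.0
    xn, yn = positions[n]
    for k, (xk, yk) in enumerate(positions):
        if k != n:
            dx, dy = xk - xn, yk - yn
            d = math.hypot(dx, dy)
            fx += G * m * m * dx / d**3
            fy += G * m * m * dy / d**3
    return (fx, fy)

forces = [internal_force(n) for n in range(4)]

# One explicit Euler step for the velocities, starting from rest.
velocities = [(vx + fx / m * dt, vy + fy / m * dt)
              for (vx, vy), (fx, fy) in zip(velocities, forces)]

T = sum(0.5 * m * (vx * vx + vy * vy) for vx, vy in velocities)
P = (sum(m * vx for vx, _ in velocities), sum(m * vy for _, vy in velocities))
# Each force points toward the center of the square (the origin here).
inward = [-(fx * x + fy * y) > 0 for (fx, fy), (x, y) in zip(forces, positions)]
print(T, P, inward)
```

After the step the kinetic energy is positive while the total momentum remains zero to rounding error, consistent with the Law of Momentum in the absence of external forces.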



1.8 Potential and Total Energy for Collections 

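In some cases, there will exist a potential function $U = U(\mathbf{r}_1, \ldots, \mathbf{r}_N, t)$ from which
all forces on all particles can be derived. Thus

$\mathbf{f}_n = -\nabla_n U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t) = -\frac{\partial}{\partial \mathbf{r}_n} U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)$ (1.28)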



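where 6

$\nabla_n = \frac{\partial}{\partial \mathbf{r}_n} = \sum_{i=1}^{3} \hat{\mathbf{e}}_i \frac{\partial}{\partial x_{ni}}$ (1.29)

and $x_{ni}$ is the ith coordinate of the nth particle of the collection, that is, $\mathbf{r}_n = \sum_{i=1}^{3} x_{ni}\,\hat{\mathbf{e}}_i$.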

The total energy E is defined as E = T + U, where T is the total kinetic energy.
Its rate of change is 



$\frac{dE}{dt} = \frac{\partial U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)}{\partial t}$ (1.30)



To see this, use the chain rule of partial differentiation and eqns (1.27, 1.28) to write 



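$\frac{dT}{dt} = \sum_{n=1}^{N} \mathbf{f}_n \cdot \mathbf{v}_n = -\sum_{n=1}^{N} \frac{\partial U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)}{\partial \mathbf{r}_n} \cdot \frac{d\mathbf{r}_n}{dt} = -\frac{dU}{dt} + \frac{\partial U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)}{\partial t}$ (1.31)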



where the last equality implies eqn (1.30). 

If the potential function $U = U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)$ happens not to depend explicitly
on the time t, the partial derivative in eqn (1.30) will vanish and E will be a constant. 
The total energy of the collection is then said to be conserved. 
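As a simple check of energy conservation, the Python sketch below uses a single particle in the time-independent potential U = k x²/2. The oscillator, the choice m = k = 1, and its known solution x(t) = cos t are illustrative assumptions, not taken from the text.

```python
# Sketch: a single particle in the time-independent potential U = k x^2 / 2.
# The oscillator and its exact solution x(t) = cos(t) for m = k = 1 are
# illustrative assumptions.
import math

m, k = 1.0, 1.0

def state(t):
    """Exact position and velocity for x(0) = 1, v(0) = 0."""
    return math.cos(t), -math.sin(t)

def total_energy(t):
    """E = T + U evaluated along the exact trajectory."""
    x, v = state(t)
    T = 0.5 * m * v * v      # kinetic energy
    U = 0.5 * k * x * x      # potential energy, no explicit t dependence
    return T + U

energies = [total_energy(0.1 * i) for i in range(100)]
spread = max(energies) - min(energies)
print(energies[0], spread)
```

Because U has no explicit time dependence, the right side of eqn (1.30) is zero and E stays at its initial value up to floating-point rounding.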



1.9 The Center of Mass 

All of the collective variables in eqn (1.10) are simple scalar or vector sums of indi- 
vidual quantities. The center of mass of the collection R is only slightly more compli- 
cated. It is defined as the mass-weighted average position of the particles making up 

6 See Section A.11 for a discussion of the notation $\partial U/\partial \mathbf{r}_n$, including cautions about its proper use.
the collection,

$\mathbf{R} = \frac{1}{M}\sum_{n=1}^{N} m_n \mathbf{r}_n$ (1.32)

This R can be used to define a new set of position vectors $\boldsymbol{\rho}_n$ for the point particles,
called relative position vectors, that give the positions of masses relative to the center
of mass, rather than relative to the origin of coordinates as the $\mathbf{r}_n$ do.




Fig. 1.2. Center of mass and relative position vectors. The center of mass is at C. 



The definition is 



$\boldsymbol{\rho}_n = \mathbf{r}_n - \mathbf{R}$ or, equivalently, $\mathbf{r}_n = \mathbf{R} + \boldsymbol{\rho}_n$ (1.33)



The vector $\boldsymbol{\rho}_n$ can be thought of as the position of particle n as seen by an observer
standing at the center of mass. The vectors $\boldsymbol{\rho}_n$ can be expanded in terms of Cartesian
unit vectors $\hat{\mathbf{e}}_i$ as

$\boldsymbol{\rho}_n = \sum_{i=1}^{3} \rho_{ni}\,\hat{\mathbf{e}}_i$ (1.34)

Component $\rho_{ni}$ will be called the ith relative coordinate of particle n.

The velocity of the center of mass V is obtained by differentiating eqn (1.32) with
respect to the time,

$\mathbf{V} = \frac{d\mathbf{R}}{dt} = \frac{1}{M}\sum_{n=1}^{N} m_n \mathbf{v}_n$ (1.35)



Then, differentiation of eqn (1.33) yields 



$\dot{\boldsymbol{\rho}}_n = \mathbf{v}_n - \mathbf{V}$ or, equivalently, $\mathbf{v}_n = \mathbf{V} + \dot{\boldsymbol{\rho}}_n$ (1.36)

where the definition $\dot{\boldsymbol{\rho}}_n = d\boldsymbol{\rho}_n/dt$ is used. This quantity will be called the relative
velocity of mass $m_n$. It may be thought of as the apparent velocity of $m_n$ as seen
by an observer riding on the center of mass. A particular $\dot{\boldsymbol{\rho}}_n$ may in some cases be
nonzero even when $\mathbf{v}_n = 0$ and the mass $m_n$ is at rest relative to absolute space, due
to the motion of the center of mass induced by motions of the other particles in the
collection. Differentiating eqn (1.34) gives

$\dot{\boldsymbol{\rho}}_n = \sum_{i=1}^{3} \dot{\rho}_{ni}\,\hat{\mathbf{e}}_i$ (1.37)

where the $\dot{\rho}_{ni}$ will be called the ith relative velocity coordinate of mass $m_n$.

An observer standing at the center of mass will calculate the center of mass to be 
at his feet, at zero distance from him, as is shown in the following lemma which will 
be used in the later proofs. 

Lemma 1.9.1: Properties of Relative Vectors 
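A very useful property of vectors $\boldsymbol{\rho}_n$ and $\dot{\boldsymbol{\rho}}_n$ is

$0 = \sum_{n=1}^{N} m_n \boldsymbol{\rho}_n$ and $0 = \sum_{n=1}^{N} m_n \dot{\boldsymbol{\rho}}_n$ (1.38)

Proof: The proof of the first expression follows directly from the definitions in eqns
(1.32, 1.33),

$\sum_{n=1}^{N} m_n \boldsymbol{\rho}_n = \sum_{n=1}^{N} m_n (\mathbf{r}_n - \mathbf{R}) = \sum_{n=1}^{N} m_n \mathbf{r}_n - \sum_{n=1}^{N} m_n \mathbf{R} = M\mathbf{R} - M\mathbf{R} = 0$ (1.39)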

with the second expression following from time differentiation of the first one. □ 
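The lemma can be checked with exact arithmetic. The following Python sketch uses arbitrary illustrative masses, positions, and velocities, with `fractions.Fraction` from the standard library so that the sums in eqn (1.38) come out exactly zero rather than approximately zero. The helper names are hypothetical.

```python
# Sketch: numerical check of Lemma 1.9.1 with exact rational arithmetic.
# Masses, positions, and velocities are arbitrary illustrative values.
from fractions import Fraction

masses = [1, 2, 3]
positions = [(1, 0, 2), (-1, 4, 0), (3, 1, -2)]
velocities = [(0, 1, 1), (2, -1, 0), (-1, 0, 3)]

def weighted_sum(weights, vectors):
    """Sum over n of w_n * vector_n, component by component."""
    return tuple(sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(3))

def sub(a, b):
    """Component-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

M = sum(masses)
R = tuple(Fraction(c, M) for c in weighted_sum(masses, positions))    # eqn (1.32)
V = tuple(Fraction(c, M) for c in weighted_sum(masses, velocities))   # eqn (1.35)
rho = [sub(r, R) for r in positions]                                  # eqn (1.33)
rho_dot = [sub(v, V) for v in velocities]                             # eqn (1.36)

check_rho = weighted_sum(masses, rho)            # first of eqn (1.38)
check_rho_dot = weighted_sum(masses, rho_dot)    # second of eqn (1.38)
print(R, V, check_rho, check_rho_dot)
```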

1.10 Center of Mass and Momentum 

Having defined the center of mass, we now can write various collective quantities in 
terms of the vectors R, $\boldsymbol{\rho}_n$ and their derivatives. The total momentum P introduced in
eqn (1.10) can be expressed in terms of the total mass M and velocity of the center of 
mass V by the remarkably simple equation 

P = MV (1.40) 

To demonstrate this result, we use the second of eqn (1.36) to rewrite P as 

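$\mathbf{P} = \sum_{n=1}^{N} \mathbf{p}_n = \sum_{n=1}^{N} m_n \mathbf{v}_n = \sum_{n=1}^{N} m_n (\mathbf{V} + \dot{\boldsymbol{\rho}}_n) = \sum_{n=1}^{N} m_n \mathbf{V} + \sum_{n=1}^{N} m_n \dot{\boldsymbol{\rho}}_n = M\mathbf{V}$ (1.41)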

where Lemma 1.9.1 was used to get the last equality. The total momentum of a
collection of particles is the same as would be produced by a single particle of mass 
M moving with the center of mass velocity V. 

The Law of Momentum in eqn (1.14) can then be written, using eqn (1.40) and 
the constancy of M, as 

$\mathbf{F}^{(\mathrm{ext})} = \frac{d\mathbf{P}}{dt} = M\mathbf{A}$ where $\mathbf{A} = \frac{d\mathbf{V}}{dt}$ (1.42)

is the acceleration of the center of mass. Thus, beginning from the assumption that 
f = ma for individual point particles, we have demonstrated that $\mathbf{F}^{(\mathrm{ext})} = M\mathbf{A}$ for
composite bodies, provided that A is defined precisely as the acceleration of the center
of mass of the body. This last result is very close to Newton’s original second law. 



1.11 Center of Mass and Angular Momentum 

The total angular momentum J can also be rewritten in terms of center of mass and 
relative quantities. It is 

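$\mathbf{J} = \mathbf{L} + \mathbf{S}$ (1.43)

where

$\mathbf{L} = \mathbf{R} \times \mathbf{P}$ and $\mathbf{S} = \sum_{n=1}^{N} \boldsymbol{\rho}_n \times (m_n \dot{\boldsymbol{\rho}}_n)$ (1.44)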

will be referred to as the “orbital” and “spin” contributions to J, respectively. Note 
that L is just the angular momentum that would be produced by a single particle 
of mass M moving with the center of mass, and that S is just the apparent angular 
momentum that would be calculated by an observer standing on the center of mass 
and using only quantities relative to herself. 

To demonstrate this result, we begin with eqn (1.10) and the indexed form of eqn 
(1.3) to write 



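$$\mathbf{J} = \sum_{n=1}^{N} \mathbf{J}_n = \sum_{n=1}^{N} (\mathbf{r}_n \times \mathbf{p}_n) = \sum_{n=1}^{N} (\mathbf{r}_n \times m_n \mathbf{v}_n) = \sum_{n=1}^{N} m_n (\mathbf{r}_n \times \mathbf{v}_n) \tag{1.45}$$

Now we introduce the definitions in eqns (1.33, 1.36), and use the linearity of cross
products to get

$$\mathbf{J} = \sum_{n=1}^{N} m_n (\mathbf{R} + \boldsymbol{\rho}_n) \times (\mathbf{V} + \dot{\boldsymbol{\rho}}_n)$$
$$= \left\{\sum_{n=1}^{N} m_n\right\} \mathbf{R} \times \mathbf{V} + \mathbf{R} \times \left\{\sum_{n=1}^{N} m_n \dot{\boldsymbol{\rho}}_n\right\} + \left\{\sum_{n=1}^{N} m_n \boldsymbol{\rho}_n\right\} \times \mathbf{V} + \sum_{n=1}^{N} m_n \boldsymbol{\rho}_n \times \dot{\boldsymbol{\rho}}_n \tag{1.46}$$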




where, in each term in curly brackets, quantities not depending on index n have been 
factored out of the sum. Lemma 1.9.1 now shows that the second and third terms 
vanish identically. The remaining two terms are identical to the L and S defined in 
eqn (1.44), as was to be proved. 
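The decomposition can also be verified numerically. The sketch below assumes four particles with randomly chosen masses, positions, and velocities (all hypothetical values) and checks both $\mathbf{P} = M\mathbf{V}$ and $\mathbf{J} = \mathbf{L} + \mathbf{S}$; the helper names are illustrative only.

```python
import random

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(c, a):
    return tuple(c * x for x in a)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vsum(vectors):
    total = (0.0, 0.0, 0.0)
    for v in vectors:
        total = add(total, v)
    return total

def decompose(masses, positions, velocities):
    """Return (P, J, L, S) following eqns (1.40) and (1.43)-(1.45)."""
    M = sum(masses)
    R = scale(1.0 / M, vsum(scale(m, r) for m, r in zip(masses, positions)))
    V = scale(1.0 / M, vsum(scale(m, v) for m, v in zip(masses, velocities)))
    P = scale(M, V)                                   # eqn (1.40)
    J = vsum(cross(r, scale(m, v))                    # eqn (1.45)
             for m, r, v in zip(masses, positions, velocities))
    L = cross(R, P)                                   # orbital part
    S = vsum(cross(sub(r, R), scale(m, sub(v, V)))    # spin part
             for m, r, v in zip(masses, positions, velocities))
    return P, J, L, S

random.seed(1)  # arbitrary seed, used only for reproducibility
masses = [random.uniform(0.5, 2.0) for _ in range(4)]
positions = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(4)]
velocities = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(4)]

P, J, L, S = decompose(masses, positions, velocities)
P_direct = vsum(scale(m, v) for m, v in zip(masses, velocities))
```

Here `P_direct` is the sum of the individual momenta, so comparing it with `P` tests eqn (1.40), and comparing `J` with `add(L, S)` tests eqn (1.43).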
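1.12 Center of Mass and Torque

The Law of Angular Momentum, eqn (1.20), contains the total external torque $\boldsymbol{\tau}^{(\mathrm{ext})}$.
Using eqns (1.17, 1.18), it may be written

$$\boldsymbol{\tau}^{(\mathrm{ext})} = \sum_{n=1}^{N} \boldsymbol{\tau}_n^{(\mathrm{ext})} = \sum_{n=1}^{N} \mathbf{r}_n \times \mathbf{f}_n^{(\mathrm{ext})} \tag{1.47}$$

Substituting eqn (1.33) for $\mathbf{r}_n$ then gives

$$\boldsymbol{\tau}^{(\mathrm{ext})} = \sum_{n=1}^{N} (\mathbf{R} + \boldsymbol{\rho}_n) \times \mathbf{f}_n^{(\mathrm{ext})} = \mathbf{R} \times \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{ext})} + \sum_{n=1}^{N} \boldsymbol{\rho}_n \times \mathbf{f}_n^{(\mathrm{ext})} = \boldsymbol{\tau}_o^{(\mathrm{ext})} + \boldsymbol{\tau}_s^{(\mathrm{ext})} \tag{1.48}$$

where we have defined the “orbital” and “spin” external torques as

$$\boldsymbol{\tau}_o^{(\mathrm{ext})} = \mathbf{R} \times \mathbf{F}^{(\mathrm{ext})} \quad\text{and}\quad \boldsymbol{\tau}_s^{(\mathrm{ext})} = \sum_{n=1}^{N} \boldsymbol{\rho}_n \times \mathbf{f}_n^{(\mathrm{ext})} \tag{1.49}$$

In a pattern that is becoming familiar, $\boldsymbol{\tau}_o^{(\mathrm{ext})}$ is the torque that would result if the total
external force on the collection acted on a particle at the center of mass, and $\boldsymbol{\tau}_s^{(\mathrm{ext})}$ is
the external torque on the collection that would be calculated by an observer standing
at the center of mass and using $\boldsymbol{\rho}_n$ instead of $\mathbf{r}_n$ as the moment arm.
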

1.13 Change of Angular Momentum 

The Law of Angular Momentum in eqn (1.20) may now be broken down into separate
parts, one for the orbital angular momentum $\mathbf{L}$ and the other for the spin angular
momentum $\mathbf{S}$. The rate of change of $\mathbf{L}$ is equal to the orbital external torque,

$$\frac{d\mathbf{L}}{dt} = \boldsymbol{\tau}_o^{(\mathrm{ext})} \tag{1.50}$$

The demonstration is almost identical to that in Section 1.2 for the angular momentum
of a single point particle,

$$\frac{d\mathbf{L}}{dt} = \frac{d}{dt}(\mathbf{R} \times \mathbf{P}) = \frac{d\mathbf{R}}{dt} \times \mathbf{P} + \mathbf{R} \times \frac{d\mathbf{P}}{dt} = \mathbf{V} \times M\mathbf{V} + \mathbf{R} \times \mathbf{F}^{(\mathrm{ext})} = 0 + \boldsymbol{\tau}_o^{(\mathrm{ext})} \tag{1.51}$$

where eqns (1.40, 1.49) and the Law of Momentum, eqn (1.14), have been used. The
rate of change of $\mathbf{S}$ is equal to the spin external torque,

$$\frac{d\mathbf{S}}{dt} = \boldsymbol{\tau}_s^{(\mathrm{ext})} \tag{1.52}$$

The demonstration begins by using eqns (1.43, 1.48) to rewrite eqn (1.20) in the form

$$\boldsymbol{\tau}_o^{(\mathrm{ext})} + \boldsymbol{\tau}_s^{(\mathrm{ext})} = \frac{d\mathbf{L}}{dt} + \frac{d\mathbf{S}}{dt} \tag{1.53}$$

Equation (1.50) can then be used to cancel $d\mathbf{L}/dt$ with $\boldsymbol{\tau}_o^{(\mathrm{ext})}$. Equating the remaining
terms then gives eqn (1.52), as was to be shown.

Thus eqns (1.50, 1.52) give a separation of the Law of Angular Momentum into
separate orbital and spin laws. The orbital angular momentum $\mathbf{L}$ and the orbital
torque $\boldsymbol{\tau}_o^{(\mathrm{ext})}$ are exactly what would be produced if all of the mass of the collection
were concentrated into a point particle at the center of mass. The evolution of
the orbital angular momentum defined by eqn (1.50) is totally independent of the
fact that the collection may or may not be spinning about the center of mass.

Equation (1.52), on the other hand, shows that the time evolution of the spin angular
momentum $\mathbf{S}$ is determined entirely by the external torque $\boldsymbol{\tau}_s^{(\mathrm{ext})}$ measured by
an observer standing at the center of mass, and is unaffected by the possible acceleration
of the center of mass that may or may not be happening simultaneously.

1.14 Center of Mass and the Work-Energy Theorems 

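The total kinetic energy $T$ may be expanded in the same way as the total angular
momentum $\mathbf{J}$ in Section 1.11. We may use $T_n = m_n v_n^2/2$ and $v_n^2 = \mathbf{v}_n \cdot \mathbf{v}_n$ to rewrite
eqn (1.10), and then use eqn (1.36) to get

$$T = \sum_{n=1}^{N} T_n = \frac{1}{2} \sum_{n=1}^{N} m_n \mathbf{v}_n \cdot \mathbf{v}_n = \frac{1}{2} \sum_{n=1}^{N} m_n (\mathbf{V} + \dot{\boldsymbol{\rho}}_n) \cdot (\mathbf{V} + \dot{\boldsymbol{\rho}}_n) \tag{1.54}$$

Expanding the dot product and using Lemma 1.9.1 then gives

$$T = T_o + T_I \tag{1.55}$$

where

$$T_o = \frac{1}{2} M V^2 \quad\text{and}\quad T_I = \frac{1}{2} \sum_{n=1}^{N} m_n \|\dot{\boldsymbol{\rho}}_n\|^2 \tag{1.56}$$

are the orbital and internal kinetic energies, respectively. The time rate of change of
the orbital kinetic energy is

$$\frac{dT_o}{dt} = \mathbf{F}^{(\mathrm{ext})} \cdot \mathbf{V} \tag{1.57}$$

The demonstration uses eqn (1.40) and the Law of Momentum, eqn (1.14),

$$\frac{dT_o}{dt} = \frac{d}{dt}\left(\frac{\mathbf{P} \cdot \mathbf{P}}{2M}\right) = \frac{\mathbf{P}}{M} \cdot \frac{d\mathbf{P}}{dt} = \mathbf{V} \cdot \mathbf{F}^{(\mathrm{ext})} \tag{1.58}$$

as was to be shown.

The time rate of change of the internal kinetic energy $T_I$ is

$$\frac{dT_I}{dt} = \sum_{n=1}^{N} \mathbf{f}_n \cdot \dot{\boldsymbol{\rho}}_n \tag{1.59}$$

The demonstration of eqn (1.59) is quite similar to that of eqn (1.52). We begin with
the collective work-energy theorem, eqn (1.27), rewritten using eqns (1.36, 1.55) as

$$\frac{dT_o}{dt} + \frac{dT_I}{dt} = \sum_{n=1}^{N} \mathbf{f}_n \cdot (\mathbf{V} + \dot{\boldsymbol{\rho}}_n) = \mathbf{V} \cdot \sum_{n=1}^{N} \mathbf{f}_n + \sum_{n=1}^{N} \mathbf{f}_n \cdot \dot{\boldsymbol{\rho}}_n \tag{1.60}$$

The earlier result in Section 1.4 that $\mathbf{F} = \mathbf{F}^{(\mathrm{ext})}$ then gives

$$\frac{dT_o}{dt} + \frac{dT_I}{dt} = \mathbf{V} \cdot \mathbf{F}^{(\mathrm{ext})} + \sum_{n=1}^{N} \mathbf{f}_n \cdot \dot{\boldsymbol{\rho}}_n \tag{1.61}$$

Using eqn (1.57) to cancel the first terms on each side gives eqn (1.59), as was to be
shown. Note the absence of the superscript “(ext)” on $\mathbf{f}_n$ in eqn (1.59). This is not a
mistake! The internal kinetic energy $T_I$ can be changed by both external and internal
forces, as we noted in Section 1.7.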

1.15 Center of Mass as a Point Particle 

It is remarkable that the center-of-mass motion of a body or other collection of point 
particles can be solved by imagining that the entire mass of the collection is a point 
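particle at the center of mass $\mathbf{R}$ with the entire external force $\mathbf{F}^{(\mathrm{ext})}$ acting on that
single point. The quantities and relations derived above,

$$\mathbf{P} = M\mathbf{V} \qquad \mathbf{F}^{(\mathrm{ext})} = \frac{d\mathbf{P}}{dt} \qquad \mathbf{L} = \mathbf{R} \times \mathbf{P} \qquad \frac{d\mathbf{L}}{dt} = \boldsymbol{\tau}_o^{(\mathrm{ext})} = \mathbf{R} \times \mathbf{F}^{(\mathrm{ext})} \tag{1.62}$$

and

$$T_o = \frac{1}{2} M V^2 \qquad \frac{dT_o}{dt} = \mathbf{F}^{(\mathrm{ext})} \cdot \mathbf{V} \tag{1.63}$$

refer only to the total mass $M$, the center of mass $\mathbf{R}$, its derivative $\mathbf{V}$, and the total
force $\mathbf{F}^{(\mathrm{ext})}$. And yet these formulas replicate all of the results obtained in Section 1.2
for a single point particle.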

If, as we have assumed, Newton’s laws apply fundamentally to Newtonian point 
particles, then these quantities and relations vindicate Newton’s application of them 
to “bodies” rather than point particles. A billiard ball (by which we mean the center 
of a billiard ball) moves according to the same laws as a single point particle of the 
same mass. 

1.16 Special Results for Rigid Bodies 

The results obtained up to this point apply to all collections, whether they be solid
bodies or a diffuse gas of point particles. Now we consider special, idealized collections
called rigid bodies. They are defined by the condition that the distance $\|\mathbf{r}_n - \mathbf{r}_{n'}\|$
between any two masses in the collection is constrained to be constant. In Chapter 8
on the kinematics of rigid-body motion, we will prove that this constraint implies the
existence of a (generally time-varying) vector $\boldsymbol{\omega}$ and the relation given in eqn (8.93),

$$\dot{\boldsymbol{\rho}}_n = \boldsymbol{\omega} \times \boldsymbol{\rho}_n \tag{1.64}$$

between each relative velocity vector and the corresponding relative location vector.
This relation has a number of interesting applications which we will discuss in later
chapters. Here we point out one of them, the effect on eqn (1.59). Rewriting that
equation and using eqn (1.64) gives

$$\frac{dT_I}{dt} = \sum_{n=1}^{N} \mathbf{f}_n \cdot \boldsymbol{\omega} \times \boldsymbol{\rho}_n = \boldsymbol{\omega} \cdot \sum_{n=1}^{N} \boldsymbol{\rho}_n \times \mathbf{f}_n = \boldsymbol{\omega} \cdot \sum_{n=1}^{N} (\mathbf{r}_n - \mathbf{R}) \times \mathbf{f}_n = \boldsymbol{\omega} \cdot (\boldsymbol{\tau} - \mathbf{R} \times \mathbf{F}) \tag{1.65}$$

where eqns (1.33, 1.10) have been used. But the Law of Momentum of Section 1.4
and the Law of Angular Momentum of Section 1.5 imply that

$$\mathbf{F} = \mathbf{F}^{(\mathrm{ext})} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{ext})} \quad\text{and}\quad \boldsymbol{\tau} = \boldsymbol{\tau}^{(\mathrm{ext})} = \sum_{n=1}^{N} \mathbf{r}_n \times \mathbf{f}_n^{(\mathrm{ext})} \tag{1.66}$$

and hence that

$$\frac{dT_I}{dt} = \boldsymbol{\omega} \cdot \left(\boldsymbol{\tau}^{(\mathrm{ext})} - \mathbf{R} \times \mathbf{F}^{(\mathrm{ext})}\right) \tag{1.67}$$

depends only on the external forces $\mathbf{f}_n^{(\mathrm{ext})}$. Thus, for rigid bodies and only for rigid
bodies, we may add an “(ext)” to eqn (1.59) and write

$$\text{Rigid bodies only:} \qquad \frac{dT_I}{dt} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{ext})} \cdot \dot{\boldsymbol{\rho}}_n \tag{1.68}$$

It follows from eqns (1.55, 1.57, 1.68) that $dT/dt$ for rigid bodies also depends only
on external forces, and so we may write eqn (1.27) in the form

$$\text{Rigid bodies only:} \qquad \frac{dT}{dt} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{ext})} \cdot \mathbf{v}_n \tag{1.69}$$



1.17 Exercises 

Exercise 1.1 In spherical polar coordinates, the radius vector is $\mathbf{r} = r\hat{\mathbf{r}}$.

(a) Use the product and chain rules of differentiation, and the partial derivatives read from
eqns (A.48 - A.51), to obtain the standard expression for $\mathbf{v} = d\mathbf{r}/dt$ as

$$\mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}} + r \sin\theta\,\dot{\phi}\,\hat{\boldsymbol{\phi}} \tag{1.70}$$

(b) By a similar process, derive the expression for $\mathbf{a} = d\mathbf{v}/dt$ in terms of
$\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\phi}}, r, \theta, \phi, \dot{r}, \dot{\theta}, \dot{\phi}, \ddot{r}, \ddot{\theta}, \ddot{\phi}$.

Exercise 1.2 Derive the identities in eqns (A.69, A.71) and demonstrate that eqn (A.72) does
follow from them.

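Exercise 1.3 Consider a circular helix defined by

$$\mathbf{r} = a \cos\beta\,\hat{\mathbf{e}}_1 + a \sin\beta\,\hat{\mathbf{e}}_2 + c\beta\,\hat{\mathbf{e}}_3 \tag{1.71}$$

where $a$, $c$ are given constants, and parameter $\beta$ increases monotonically along the curve.

(a) Express the Serret-Frenet unit vectors $\hat{\mathbf{t}}, \hat{\mathbf{n}}, \hat{\mathbf{b}}$, the curvature $\rho$, and the torsion $\kappa$, in terms
of $a, c, \beta, \hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3$.

(b) Show that $\hat{\mathbf{n}}$ is always parallel to the x-y plane.

Exercise 1.4 In Section A.12 it is stated that the Serret-Frenet relations eqns (A.77, A.78,
A.79) may be written as shown in eqn (A.80),

$$\frac{d\hat{\mathbf{t}}}{ds} = \boldsymbol{\omega} \times \hat{\mathbf{t}} \qquad \frac{d\hat{\mathbf{n}}}{ds} = \boldsymbol{\omega} \times \hat{\mathbf{n}} \qquad \frac{d\hat{\mathbf{b}}}{ds} = \boldsymbol{\omega} \times \hat{\mathbf{b}} \tag{1.72}$$

where $\boldsymbol{\omega} = \kappa\,\hat{\mathbf{t}} + \rho\,\hat{\mathbf{b}}$. Verify these formulas.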

Exercise 1.5 A one tonne (1000 kg) spacecraft, in interstellar space far from large masses, 
explodes into three pieces. At the instant of the explosion, the spacecraft was at the origin of 
some inertial system of coordinates and had a velocity of 30 km/sec in the +x direction rel- 
ative to it. Precisely 10 sec after the explosion, two of the pieces are located simultaneously. 
They are a 300 kg piece at coordinates (400, 50, -20) km and a 500 kg piece at coordinates
(240, 10, 32) km. 

(a) Where was the third piece 10 sec after the explosion? 

(b) Mission control wants to know where the missing piece will be 1 hour after the explo- 
sion. Give them a best estimate and an error circle. (Assume that the spacecraft had a largest 
dimension of 10 m, so that, at worst, a given piece might have come from a point 10 m from 
the center.) 

(c) What if the spacecraft had been spinning end-over-end just before it exploded? Would the
above answers change? At all? Appreciably? Explain.




Fig. 1.3. Illustration for Exercise 1.6. 



Exercise 1.6 Three equal point masses $m_1 = m_2 = m_3 = m$ are attached to a rigid, massless
rod of total length $2b$. Masses #1 and #3 are at the ends of the rod and #2 is in the middle.
Mass $m_1$ is suspended from a frictionless pivot at the origin of an inertial coordinate system.
Assume that the motion is constrained in a frictionless manner so that the masses all stay in
the x-y plane. Let a uniform gravitational field $\mathbf{g} = g\hat{\mathbf{e}}_1$ act in the positive x-direction.

(a) Using plane polar coordinates, letting the $\hat{\mathbf{r}}$-direction be along the stick and letting $\phi$ be
the angle between the stick and the x-axis, use the law of angular momentum to obtain $\ddot{\phi}$ and
$\dot{\phi}^2$ as functions of $\phi$.

(b) From the above, obtain $d^2\mathbf{r}_3/dt^2$ as a function of $\phi, \hat{\mathbf{r}}, \hat{\boldsymbol{\phi}}$ and use

$$\mathbf{f}_3^{(\mathrm{int})} = m_3 \mathbf{a}_3 - m_3 \mathbf{g} \tag{1.73}$$

to obtain the internal force $\mathbf{f}_3^{(\mathrm{int})}$ on mass $m_3$.

(c) If it is entirely due to central forces from $m_1$ and $m_2$ as is required by the “strong form”
of the second law, then $\mathbf{f}_3^{(\mathrm{int})}$ should be parallel to the stick. Is it? Explain.$^{7}$

Exercise 1.7 Show clearly how eqns (1.55, 1.56) follow from eqn (1.54). 




Fig. 1.4. Illustration for Exercise 1.8. 

Exercise 1.8 A hollow, right-circular cylinder of mass $M$ and radius $a$ rolls without slipping
straight down an inclined plane of angle $\alpha$, starting from rest. Assume a uniform gravitational
field $\mathbf{g} = -g\hat{\mathbf{e}}_2$ acting downwards.

(a) After the center of mass of the cylinder has fallen a distance $h$, what are the vector values
of $\mathbf{V}$, $\mathbf{P}$, $\mathbf{S}$ for the cylinder? [Note: This question should be answered without considering the
details of the forces acting. Assume that rolling without slipping conserves energy.]

(b) Using your results in part (a), find the force $\mathbf{F}^{(\mathrm{ext})}$ and spin torque $\boldsymbol{\tau}_s^{(\mathrm{ext})}$ acting on the
cylinder.

Exercise 1.9 Write out eqn (A.67) and verify that it does express the correct chain rule result
for $df/dt$.

Exercise 1.10 If all external forces $\mathbf{f}_n^{(\mathrm{ext})}$ on the point masses of a rigid body are derived from
an external potential $U^{(\mathrm{ext})}(\mathbf{r}_1, \ldots, \mathbf{r}_N, t)$, show that the quantity $E = T + U^{(\mathrm{ext})}$ obeys

$$\frac{dE}{dt} = \frac{\partial U^{(\mathrm{ext})}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)}{\partial t} \tag{1.74}$$

7 See Stadler, W. (1982) “Inadequacy of the Usual Newtonian Formulation for Certain Problems in Particle Mechanics,”
Am. J. Phys. 50, p. 595.

Exercise 1.11 Let a collection of point masses $m_1, m_2, \ldots, m_N$ move without interaction in
a uniform, external gravitational field $\mathbf{g}$ so that $\mathbf{f}_n = \mathbf{f}_n^{(\mathrm{ext})} = m_n \mathbf{g}$.

(a) Demonstrate that a possible potential for this field is

$$U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t) = -\sum_{n=1}^{N} m_n \mathbf{r}_n \cdot \mathbf{g} \tag{1.75}$$

which may also be written as

$$U = -M\mathbf{R} \cdot \mathbf{g} \tag{1.76}$$

where $M$ is the total mass of the collection, and $\mathbf{R}$ is its center of mass.

(b) Express $\mathbf{F}^{(\mathrm{ext})}$, $\boldsymbol{\tau}_o^{(\mathrm{ext})}$, $\boldsymbol{\tau}_s^{(\mathrm{ext})}$ in terms of $M$, $\mathbf{g}$, $\mathbf{R}$ for this collection.

(c) Which of the following are conserved: $E$, $\mathbf{P}$, $\mathbf{L}$, $\mathbf{S}$, $T_o$, $T_I$?




Fig. 1.5. Illustration for Exercise 1.12. Mass $m_3^{(2)}$ is the third mass in the second collection. Vector
$\mathbf{R}^{(2)}$ is the center of mass of the second collection, and $\mathbf{R}$ is the center of mass of the entire system.



Exercise 1.12 Suppose that a total collection is made up of $C$ sub-collections, labeled by
the index $a = 1, \ldots, C$. The $a$th sub-collection has $N^{(a)}$ particles, mass $M^{(a)}$, momentum
$\mathbf{P}^{(a)}$, center of mass $\mathbf{R}^{(a)}$, and center-of-mass velocity $\mathbf{V}^{(a)}$. (You might think of this as a
globular cluster made up of stars. Each star is a sub-collection and the whole cluster is the
total collection.)

(a) Demonstrate that the center of mass $\mathbf{R}$ and momentum $\mathbf{P}$ of the total collection may be
written as

$$\mathbf{R} = \frac{1}{M} \sum_{a=1}^{C} M^{(a)} \mathbf{R}^{(a)} \qquad \mathbf{P} = \sum_{a=1}^{C} \mathbf{P}^{(a)} \tag{1.77}$$

where

$$M = \sum_{a=1}^{C} M^{(a)} \quad\text{and}\quad \mathbf{P}^{(a)} = M^{(a)} \mathbf{V}^{(a)} \tag{1.78}$$

i.e. that the total center of mass and total momentum may be calculated by treating each
sub-collection as a single particle with all of its mass at its center of mass.

(b) Let the $n$th mass of the $a$th sub-collection $m_n^{(a)}$ have location $\mathbf{r}_n^{(a)}$. Define $\boldsymbol{\sigma}^{(a)} = \mathbf{R}^{(a)} - \mathbf{R}$
and $\boldsymbol{\rho}_n^{(a)} = \mathbf{r}_n^{(a)} - \mathbf{R}^{(a)}$ so that

$$\mathbf{r}_n^{(a)} = \mathbf{R} + \boldsymbol{\sigma}^{(a)} + \boldsymbol{\rho}_n^{(a)} \tag{1.79}$$

Prove the identities

$$\sum_{n=1}^{N^{(a)}} m_n^{(a)} \boldsymbol{\rho}_n^{(a)} = 0 \qquad \sum_{a=1}^{C} M^{(a)} \boldsymbol{\sigma}^{(a)} = 0 \tag{1.80}$$

and use them and their first time derivatives to demonstrate that the total angular momentum
$\mathbf{J}$ may be written as

$$\mathbf{J} = \mathbf{L} + \mathbf{K} + \sum_{a=1}^{C} \mathbf{S}^{(a)} \tag{1.81}$$

where

$$\mathbf{L} = \mathbf{R} \times M\mathbf{V} \qquad \mathbf{K} = \sum_{a=1}^{C} \boldsymbol{\sigma}^{(a)} \times M^{(a)} \dot{\boldsymbol{\sigma}}^{(a)} \qquad \mathbf{S}^{(a)} = \sum_{n=1}^{N^{(a)}} \boldsymbol{\rho}_n^{(a)} \times m_n^{(a)} \dot{\boldsymbol{\rho}}_n^{(a)} \tag{1.82}$$

Note that $\mathbf{K}$ is just the spin angular momentum that would result if each sub-collection were
a point mass located at its center of mass. Then the sum over $\mathbf{S}^{(a)}$ adds the intrinsic spins of
the sub-collections.

(c) Suppose that a system consists of a massless stick of length b with six point masses, each 
of mass m, held rigidly by a massless frame at the vertices of a plane hexagon centered on 
one end of the stick. Similarly, four point masses, each also of mass m, are arranged at the 
vertices of a plane square centered on the other end. How far from the first end is the center 
of mass of the whole system? Do you need to assume that the hexagon and the square are 
co-planar? 

Exercise 1.13 Consider a system consisting of two point masses, $m_1$ at vector location $\mathbf{r}_1$
and $m_2$ at $\mathbf{r}_2$, acted on only by internal forces $\mathbf{f}_{12}$ and $\mathbf{f}_{21}$, respectively. Denote the vector
from the first to the second mass by $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$. For this exercise, use the model in which
the interaction between $m_1$ and $m_2$ is due entirely to these forces.

(a) Show that Axiom 1.4.1 implies that $\mathbf{f}_{21} + \mathbf{f}_{12} = 0$.

(b) Show that this and Axiom 1.5.1 imply that $\mathbf{f}_{21}$ and $\mathbf{f}_{12}$ must be parallel or anti-parallel to
$\mathbf{r}$ (i.e., be central forces).

(c) Prove that $d^2\mathbf{R}/dt^2 = 0$ and $\mu\,(d^2\mathbf{r}/dt^2) = \mathbf{f}_{21}$ where $\mathbf{R}$ is the center of mass and
$\mu = m_1 m_2/(m_1 + m_2)$ is the reduced mass.

(d) Show that a potential of the form $U(\mathbf{r}_1, \mathbf{r}_2) = U_0 f(r)$ where $U_0$ is a constant and
$r = \sqrt{\mathbf{r} \cdot \mathbf{r}}$ will produce forces $\mathbf{f}_{12} = -\partial U/\partial \mathbf{r}_1$ and $\mathbf{f}_{21} = -\partial U/\partial \mathbf{r}_2$ having the required
properties.

Exercise 1.14 Two masses $m_1$ and $m_2$ are connected by a massless spring of zero rest length,
and force constant $k$. At time zero, the masses $m_1$ and $m_2$ lie at rest on the x axis at coordinates
$(-a, 0, 0)$ and $(+a, 0, 0)$, respectively. Before time zero, a third mass $m_3$ is moving upwards
with velocity $\mathbf{v}_0 = v_0 \hat{\mathbf{e}}_2$, x-coordinate $a$, and y-coordinate less than zero. At time zero, $m_3$
collides with, and sticks to, $m_2$. Assume that the collision is impulsive, and is complete before
$m_1$ or $m_2$ have changed position. Assume that the three masses are equal, with $m_1 = m_2 =
m_3 = m$. Ignore gravity.

(a) Using the initial conditions of the problem to determine the constants of integration, write
expressions for the center of mass vector $\mathbf{R}$ and the relative position vector $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$ as
functions of time for all $t > 0$.

(b) Write expressions for $\mathbf{r}_1$ and $\mathbf{r}_2$, the vector locations of masses $m_1$ and $m_2$, respectively,
for all times $t > 0$.

(c) Show that mass $m_1$ has zero velocity at times $t_n = 2\pi n \sqrt{3m/2k}$, for $n = 0, 1, 2, \ldots$ but
that the masses never return to the x axis.

Exercise 1.15 Prove that the $\mathbf{V}_\perp$ in eqn (A.4) can also be written as $\mathbf{V}_\perp = \hat{\mathbf{n}} \times (\mathbf{V} \times \hat{\mathbf{n}})$.

Exercise 1.16 Use eqn (A.61) to derive the related identity

$$(\mathbf{A} \times \mathbf{B}) \times \mathbf{C} = \mathbf{B}\,(\mathbf{A} \cdot \mathbf{C}) - \mathbf{A}\,(\mathbf{B} \cdot \mathbf{C}) \tag{1.83}$$

and show that the triple cross product is not associative.

If modern mechanics began with Newton, modern analytical mechanics can be said
to have begun with the work of the eighteenth century mathematicians who elaborated
his ideas. Without changing Newton’s fundamental principles, Euler, Laplace,
and Lagrange developed elegant computational methods for the increasingly complex
problems to which Newtonian mechanics was being applied.

The Lagrangian formulation of mechanics is, at first glance, merely an abstract 
way of writing Newton’s second law. Someone approaching it for the first time will 
possibly find it ugly and counterintuitive. But the beauty of it is that, if ugly, it is 
terminally ugly. When simple Cartesian coordinates are replaced by the most gen- 
eral variables capable of describing the system adequately, the Lagrange equations do 
not change, do not become any more ugly than they were. The vector methods of 
Chapter 1 fail when a mechanical system is described by systems of coordinates much 
more general than the standard curvilinear ones. But such cases are treated easily by 
Lagrangian mechanics. 

Another beauty of the Lagrangian method is that it frees us from the task of keep- 
ing track of the components of force vectors and the identities of the particles they 
act on. The whole of mechanics is reduced to an algebraic method. Lagrange himself 
was proud of the fact that his treatise on mechanics contained not a single figure. 8 

2.1 Configuration Space 

In Chapter 1, the position of the $n$th point particle is given by the vector

$$\mathbf{r}_n = x_{n1}\hat{\mathbf{e}}_1 + x_{n2}\hat{\mathbf{e}}_2 + x_{n3}\hat{\mathbf{e}}_3 \tag{2.1}$$

where $x_{n1}, x_{n2}, x_{n3}$ are its x, y, z coordinates, respectively. Lagrangian mechanics,
however, uses what are called generalized coordinates, in which a particular coordinate 
is usually not tied to a particular particle. These generalized coordinates may be any 
set of independent variables capable of specifying the configuration of the system. 
Taken together, they define what is called configuration space. 

For example, the simplest set of generalized coordinates is what we will call the 
s-system. Imagine all the Cartesian coordinates of N point masses listed in serial order, 



8 In the preface to his Mechanique Analytique , Lagrange wrote, “No diagrams are found in this work. 
The methods that I explain in it require neither constructions nor geometrical or mechanical arguments, 
but only the algebraic operations inherent to a regular and uniform process. Those who love Analysis will, 
with joy, see mechanics become a new branch of it and will be grateful to me for thus having extended its 
field.” See Chapter 11 of Dugas (1955).

as in

$$x_{11}, x_{12}, x_{13}, x_{21}, x_{22}, x_{23}, x_{31}, \ldots, x_{N1}, x_{N2}, x_{N3} \tag{2.2}$$

and define the corresponding $s_i$ generalized coordinates as

$$s_1, s_2, s_3, s_4, s_5, s_6, s_7, \ldots, s_{D-2}, s_{D-1}, s_D \tag{2.3}$$

where $D = 3N$ is called the number of degrees of freedom of the system. Thus $s_1 = x_{11}$,
$s_2 = x_{12}$, $s_3 = x_{13}$, $s_4 = x_{21}$, ..., $s_7 = x_{31}$, etc. For example, $s_5$ is the y-coordinate of
the second particle.

Similarly, the force acting on the $n$th particle is

$$\mathbf{f}_n = f_{n1}\hat{\mathbf{e}}_1 + f_{n2}\hat{\mathbf{e}}_2 + f_{n3}\hat{\mathbf{e}}_3 \tag{2.4}$$

and we can define the generalized forces in the s-system, $F_i$, by the correspondence
between the lists

$$f_{11}, f_{12}, f_{13}, f_{21}, f_{22}, f_{23}, f_{31}, \ldots, f_{N1}, f_{N2}, f_{N3} \tag{2.5}$$

and

$$F_1, F_2, F_3, F_4, F_5, F_6, F_7, \ldots, F_{D-2}, F_{D-1}, F_D \tag{2.6}$$

Masses may also be relabeled by means of a correspondence between the lists

$$m_1, m_1, m_1, m_2, m_2, m_2, m_3, \ldots, m_N, m_N, m_N \tag{2.7}$$

and

$$M_1, M_2, M_3, M_4, M_5, M_6, M_7, \ldots, M_{D-2}, M_{D-1}, M_D \tag{2.8}$$

Note that $M_1 = M_2 = M_3 = m_1$, $M_4 = M_5 = M_6 = m_2$, etc.

With these definitions, Newton’s second law can be written in either of two equivalent
ways, the vector form from Chapter 1, or the equivalent form in the s-system,

$$\mathbf{f}_n = m_n \frac{d^2\mathbf{r}_n}{dt^2} \quad\text{or}\quad F_i = M_i \frac{d^2 s_i}{dt^2} \tag{2.9}$$

where $n = 1, \ldots, N$ and $i = 1, \ldots, D$. The content of these two equations is identical,
of course, but the second equation treats all coordinates equally, without reference to
the particular particle that a coordinate belongs to.
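The relabeling in eqns (2.2, 2.3) and (2.7, 2.8) amounts to flattening the per-particle Cartesian data into single lists. The following is a minimal sketch, assuming two particles with made-up masses and positions; the names `to_s_system` and `s_index` are illustrative and do not come from the text.

```python
def to_s_system(masses, positions):
    """Flatten per-particle data into the s-system lists.

    positions[n-1] holds (x_n1, x_n2, x_n3); the result s lists them in
    serial order as in eqn (2.2), and M repeats each particle mass three
    times as in eqn (2.7).
    """
    s = []
    M = []
    for m, r in zip(masses, positions):
        for component in r:
            s.append(component)
            M.append(m)
    return s, M

def s_index(n, c):
    """1-based s-system index of Cartesian component c (1..3) of particle n (1..N)."""
    return 3 * (n - 1) + c

masses = [2.0, 5.0]                              # hypothetical m_1, m_2
positions = [(0.1, 0.2, 0.3), (1.1, 1.2, 1.3)]   # hypothetical r_1, r_2

s, M = to_s_system(masses, positions)
D = len(s)                        # D = 3N degrees of freedom
y_of_second = s[s_index(2, 2) - 1]  # s_5, the y-coordinate of the second particle
```

Python lists are 0-based, so the 1-based label $s_i$ of the text is stored at list position `i - 1`.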

Other physical quantities can be expressed in the s-system notation. For example,
corresponding to the vector definition $\mathbf{p}_n = m_n \mathbf{v}_n$ for $n = 1, \ldots, N$, the generalized
momentum can be defined, for all $i = 1, \ldots, D$, by

$$P_i = M_i \dot{s}_i \tag{2.10}$$

where $\dot{s}_i = ds_i/dt$ is called the generalized velocity. Then eqn (2.9) can be written in
s-system notation as

$$F_i = \frac{dP_i}{dt} \tag{2.11}$$

for $i = 1, \ldots, D$.

The kinetic energy defined in eqn (1.25) can also be written in the two equivalent
ways, the first from Chapter 1, or the second using the s-system coordinates and
masses,

$$T = \frac{1}{2} \sum_{n=1}^{N} m_n v_n^2 \quad\text{or}\quad T = \frac{1}{2} \sum_{i=1}^{D} M_i \dot{s}_i^2 \tag{2.12}$$

2.2 Newton’s Second Law in Lagrangian Form 

In Section 1.8 of Chapter 1, we noted that the total force on the $n$th particle can often
be derived from a potential function $U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)$. Here, we are going to allow for
the possibility that some, but perhaps not all, of the force on a particle can be derived
from a potential so that

$$\mathbf{f}_n = -\nabla_n U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t) + \mathbf{f}_n^{(\mathrm{NP})} \tag{2.13}$$

where

$$\nabla_n = \frac{\partial}{\partial \mathbf{r}_n} = \hat{\mathbf{e}}_1 \frac{\partial}{\partial x_{n1}} + \hat{\mathbf{e}}_2 \frac{\partial}{\partial x_{n2}} + \hat{\mathbf{e}}_3 \frac{\partial}{\partial x_{n3}} \tag{2.14}$$

and superscript “NP” means that $\mathbf{f}_n^{(\mathrm{NP})}$ is that part of the force that is Not derived from
a Potential. Expressed in the s-system notation, eqn (2.13) becomes

$$F_i = -\frac{\partial}{\partial s_i} U(s_1, s_2, \ldots, s_D, t) + F_i^{(\mathrm{NP})} \tag{2.15}$$

where $i = 1, \ldots, D$, and $U(s_1, \ldots, s_D, t)$ is obtained by writing $U(\mathbf{r}_1, \ldots, \mathbf{r}_N, t)$ out in
terms of its Cartesian coordinates and then using the correspondence between eqns
(2.2, 2.3) to translate to the $s_i$ variables. Using eqns (2.11, 2.15), Newton’s second
law can now be written as

$$\frac{dP_i}{dt} = -\frac{\partial}{\partial s_i} U(s_1, s_2, \ldots, s_D, t) + F_i^{(\mathrm{NP})} \tag{2.16}$$

for $i = 1, \ldots, D$.

To obtain the Lagrangian form of Newton’s second law, define the Lagrangian
$L = L(s, \dot{s}, t)$ as

$$L(s, \dot{s}, t) = T(\dot{s}) - U(s, t) \tag{2.17}$$

In expanded form, this is

$$L = L(s_1, s_2, \ldots, s_D, \dot{s}_1, \dot{s}_2, \ldots, \dot{s}_D, t) = \frac{1}{2} \sum_{j=1}^{D} M_j \dot{s}_j^2 - U(s_1, s_2, \ldots, s_D, t) \tag{2.18}$$

Then it follows that

$$\frac{\partial}{\partial \dot{s}_i} L(s_1, s_2, \ldots, s_D, \dot{s}_1, \dot{s}_2, \ldots, \dot{s}_D, t) = M_i \dot{s}_i = P_i \tag{2.19}$$

and

$$\frac{\partial}{\partial s_i} L(s_1, s_2, \ldots, s_D, \dot{s}_1, \dot{s}_2, \ldots, \dot{s}_D, t) = -\frac{\partial}{\partial s_i} U(s_1, s_2, \ldots, s_D, t) \tag{2.20}$$

so that eqn (2.16) may be rewritten as

$$\frac{d}{dt}\left(\frac{\partial L(s, \dot{s}, t)}{\partial \dot{s}_i}\right) - \frac{\partial L(s, \dot{s}, t)}{\partial s_i} = F_i^{(\mathrm{NP})} \tag{2.21}$$

for $i = 1, \ldots, D$. This is the Lagrangian form of Newton’s second law, as expressed in
the s-system of coordinates. Note that we have used the usual shorthand, abbreviating
$L(s_1, \ldots, s_D, \dot{s}_1, \ldots, \dot{s}_D, t)$ to the shorter form $L(s, \dot{s}, t)$.



2.3 A Simple Example 

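Suppose one particle of mass $m$ is acted on by a spherically symmetric, harmonic
oscillator force attracting it to the origin. Then

$$L = \frac{1}{2}\left(M_1 \dot{s}_1^2 + M_2 \dot{s}_2^2 + M_3 \dot{s}_3^2\right) - \frac{1}{2} k \left(s_1^2 + s_2^2 + s_3^2\right) \tag{2.22}$$

But, in problems this simple, it is often clearer to replace $s_1, s_2, s_3$ by $x, y, z$, $\dot{s}_1, \dot{s}_2, \dot{s}_3$
by $\dot{x}, \dot{y}, \dot{z}$, and $M_1, M_2, M_3$ by $m$, giving

$$L = \frac{m}{2}\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) - \frac{k}{2}\left(x^2 + y^2 + z^2\right) \tag{2.23}$$

We can use this more familiar notation while still thinking of the s-system in the back
of our minds. Then eqn (2.21) becomes

$$\begin{aligned}
\text{For } i = 1: &\quad \frac{d}{dt}\left(\frac{\partial L(s, \dot{s}, t)}{\partial \dot{x}}\right) - \frac{\partial L(s, \dot{s}, t)}{\partial x} = 0 \quad\text{or}\quad m\ddot{x} + kx = 0 \\
\text{For } i = 2: &\quad \frac{d}{dt}\left(\frac{\partial L(s, \dot{s}, t)}{\partial \dot{y}}\right) - \frac{\partial L(s, \dot{s}, t)}{\partial y} = 0 \quad\text{or}\quad m\ddot{y} + ky = 0 \\
\text{For } i = 3: &\quad \frac{d}{dt}\left(\frac{\partial L(s, \dot{s}, t)}{\partial \dot{z}}\right) - \frac{\partial L(s, \dot{s}, t)}{\partial z} = 0 \quad\text{or}\quad m\ddot{z} + kz = 0
\end{aligned} \tag{2.24}$$
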

which are the correct differential equations of motion for this problem. 
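As a rough numerical check of this example, the left side of eqn (2.21) can be evaluated by finite differences along a known solution of $m\ddot{x} + kx = 0$. The sketch below assumes hypothetical values for $m$, $k$, the amplitudes, and the phases, and uses central differences with hand-picked step sizes, so it only confirms the equations to within discretization error.

```python
import math

def lagrangian(pos, vel, m, k):
    """L = (m/2)|v|^2 - (k/2)|r|^2, as in eqn (2.23)."""
    kinetic = 0.5 * m * sum(v * v for v in vel)
    potential = 0.5 * k * sum(x * x for x in pos)
    return kinetic - potential

def dL_dpos(pos, vel, i, m, k, h=1e-5):
    """Central-difference estimate of dL/dx_i."""
    up, dn = list(pos), list(pos)
    up[i] += h
    dn[i] -= h
    return (lagrangian(up, vel, m, k) - lagrangian(dn, vel, m, k)) / (2 * h)

def dL_dvel(pos, vel, i, m, k, h=1e-5):
    """Central-difference estimate of dL/d(xdot_i)."""
    up, dn = list(vel), list(vel)
    up[i] += h
    dn[i] -= h
    return (lagrangian(pos, up, m, k) - lagrangian(pos, dn, m, k)) / (2 * h)

def trajectory(t, m, k, amplitudes, phases):
    """A solution x_i(t) = A_i cos(w t + p_i) with w = sqrt(k/m)."""
    w = math.sqrt(k / m)
    pos = [A * math.cos(w * t + p) for A, p in zip(amplitudes, phases)]
    vel = [-A * w * math.sin(w * t + p) for A, p in zip(amplitudes, phases)]
    return pos, vel

def lagrange_residual(t, i, m, k, amplitudes, phases, dt=1e-4):
    """d/dt(dL/dxdot_i) - dL/dx_i evaluated along the trajectory."""
    pos, vel = trajectory(t, m, k, amplitudes, phases)
    pos_up, vel_up = trajectory(t + dt, m, k, amplitudes, phases)
    pos_dn, vel_dn = trajectory(t - dt, m, k, amplitudes, phases)
    dp_dt = (dL_dvel(pos_up, vel_up, i, m, k)
             - dL_dvel(pos_dn, vel_dn, i, m, k)) / (2 * dt)
    return dp_dt - dL_dpos(pos, vel, i, m, k)

m, k = 2.0, 3.0                 # hypothetical mass and force constant
amplitudes = [1.0, 0.5, -0.3]   # hypothetical amplitudes for x, y, z
phases = [0.0, 0.4, 1.1]        # hypothetical phases
residuals = [lagrange_residual(0.7, i, m, k, amplitudes, phases) for i in range(3)]
```

Each entry of `residuals` should be close to zero, since no non-potential forces act in this example.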



2.4 Arbitrary Generalized Coordinates 

The generalized coordinates of the s-system are only a trivial re-labelling of Cartesian 
coordinates. The real power of the Lagrangian method appears when we move to 
more general coordinate sets. 

Let $q_1, q_2, \ldots, q_D$ be any set of $D$ independent variables, which we will call the
q-system, such that their values completely specify all of the s-system values, and vice
versa. We write each of the $s_i$, for $i = 1, \ldots, D$, as a function of these $q$ variables and
possibly also the time $t$,

$$s_i = s_i(q_1, q_2, \ldots, q_D, t) \tag{2.25}$$

The only restriction placed on the set $q_1, q_2, \ldots, q_D$ is that eqn (2.25) must be invertible
in an open neighborhood of every point of configuration space, so that we can
write the inverse relations, for $k = 1, \ldots, D$,

$$q_k = q_k(s_1, s_2, \ldots, s_D, t) \tag{2.26}$$

As proved in Theorem D.24.1, the necessary and sufficient condition for this inversion
is the Jacobian determinant condition

$$\left|\frac{\partial s(q, t)}{\partial q}\right| \neq 0 \tag{2.27}$$

where the $D \times D$ Jacobian matrix $(\partial s(q, t)/\partial q)$ is defined, for $i, k = 1, \ldots, D$, by$^{9}$

$$\left(\frac{\partial s(q, t)}{\partial q}\right)_{ik} = \frac{\partial s_i(q, t)}{\partial q_k} \tag{2.28}$$

Generalized coordinates $q$ which obey eqn (2.27) at every point$^{10}$ will be referred to
as good generalized coordinates.

Note that we may define a matrix $(\partial q(s, t)/\partial s)$ by using eqn (2.26) to write its
matrix elements, for $i, k = 1, \ldots, D$, as

$$\left(\frac{\partial q(s, t)}{\partial s}\right)_{ki} = \frac{\partial q_k(s, t)}{\partial s_i} \tag{2.29}$$

It follows from eqn (2.27) and the discussion in Section D.25 of Appendix D that
matrix $(\partial s(q, t)/\partial q)$ has eqn (2.29) as its inverse matrix

$$\left(\frac{\partial s(q, t)}{\partial q}\right)^{-1} = \left(\frac{\partial q(s, t)}{\partial s}\right)$$

so that

$$\sum_{k=1}^{D} \frac{\partial s_i(q, t)}{\partial q_k} \frac{\partial q_k(s, t)}{\partial s_j} = \delta_{ij} \tag{2.30}$$



In the next four sections, we derive some important relations between the s- and 
q-systems. 11 Then, in Section 2.9 we will prove the main result of this chapter: The 
Lagrange equations in a general q-system have the same form as that derived in eqn 
(2.21) for the s-system. 

9 Note that here, and throughout the chapter, we often use the shorthand notations $q = q_1, \ldots, q_D$ and
$s = s_1, \ldots, s_D$ in which a single, unsubscripted letter stands for a set of variables.

10 In practice, this condition may be violated in regions whose dimensionality is less than $D$. For example,
in the transition to spherical polar coordinates, the condition is violated on the whole of the z-axis. Such
regions may be excluded, and then approached as a limit.

11 Of course the q-system, being general, includes the s-system as a special case. But we will continue to
refer to these two systems in this and the next few chapters to illustrate the methods of transformation
between systems. The s-system is particularly important because of its close relation to Cartesian coordinates.

2.5 Generalized Velocities in the q-System

In Section 2.1 we defined $\dot{s}_i = ds_i/dt$ as the generalized velocities in the s-system.
A similar definition, $\dot{q}_k = dq_k/dt$, is used for generalized velocities in the q-system.
Since $s_i$ in eqn (2.25) depends only on $q, t$, the relation between $\dot{s}$ and $\dot{q}$ takes a
simple form.

Using the chain rule to differentiate eqn (2.25) with respect to the time allows
$\dot{s}_i = ds_i/dt$ to be expanded as a function of $q$ and its time derivatives. The expansion
is

$$\dot{s}_i = \frac{ds_i(q, t)}{dt} = \sum_{k=1}^{D} \frac{\partial s_i(q, t)}{\partial q_k} \frac{dq_k}{dt} + \frac{\partial s_i(q, t)}{\partial t} = \sum_{k=1}^{D} \frac{\partial s_i(q, t)}{\partial q_k} \dot{q}_k + \frac{\partial s_i(q, t)}{\partial t} \tag{2.31}$$

for each $i = 1, \ldots, D$, where the generalized velocities in the q-system are denoted
in the last expression by $\dot{q}_k = dq_k/dt$. Inspection of eqn (2.31) shows that each $\dot{s}_i$
depends on $q, t$ through the dependency of the partial derivatives on these quantities,
and on $\dot{q}$ due to the $\dot{q}_k$ factors in the sum. Thus

$$\dot{s}_i = \dot{s}_i(q, \dot{q}, t) = \dot{s}_i(q_1, q_2, \ldots, q_D, \dot{q}_1, \dot{q}_2, \ldots, \dot{q}_D, t) \tag{2.32}$$
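The expansion in eqn (2.31) can be illustrated with a concrete, time-dependent transformation. The sketch below assumes one particle in a plane described by polar coordinates measured in a frame rotating at a constant rate `OMEGA`, so that $s_1 = r\cos(\phi + \Omega t)$ and $s_2 = r\sin(\phi + \Omega t)$; this choice of q-system, the rate, and the sample path are all assumptions made for illustration. The chain-rule result is compared with a direct finite-difference time derivative.

```python
import math

OMEGA = 0.3  # hypothetical rotation rate of the coordinate frame

def s_of_q(q, t):
    """s_i(q, t) for q = (r, phi) in the rotating polar system."""
    r, phi = q
    angle = phi + OMEGA * t
    return [r * math.cos(angle), r * math.sin(angle)]

def jacobian(q, t):
    """Matrix of partial derivatives ds_i/dq_k."""
    r, phi = q
    angle = phi + OMEGA * t
    return [[math.cos(angle), -r * math.sin(angle)],
            [math.sin(angle), r * math.cos(angle)]]

def ds_dt_explicit(q, t):
    """Partial derivatives ds_i/dt at fixed q."""
    r, phi = q
    angle = phi + OMEGA * t
    return [-r * OMEGA * math.sin(angle), r * OMEGA * math.cos(angle)]

def s_dot(q, q_dot, t):
    """Generalized velocities in the s-system from eqn (2.31)."""
    J = jacobian(q, t)
    explicit = ds_dt_explicit(q, t)
    return [sum(J[i][k] * q_dot[k] for k in range(2)) + explicit[i]
            for i in range(2)]

def path(t):
    """A hypothetical path q(t) together with its velocities."""
    q = [1.0 + 0.1 * t, 0.5 * t * t]
    q_dot = [0.1, t]
    return q, q_dot

t0, h = 0.8, 1e-6
q0, q_dot0 = path(t0)
from_chain_rule = s_dot(q0, q_dot0, t0)
s_up = s_of_q(path(t0 + h)[0], t0 + h)
s_dn = s_of_q(path(t0 - h)[0], t0 - h)
numeric = [(a - b) / (2 * h) for a, b in zip(s_up, s_dn)]
```

The explicit $\partial s_i/\partial t$ term matters here only because the transformation depends on time; setting `OMEGA = 0.0` reduces the example to ordinary plane polar coordinates.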



2.6 Generalized Forces in the q-System 

Given the generalized force $F_i$ in the s-system, the generalized force $Q_k$ in the q-system
is defined as

$$Q_k = \sum_{i=1}^{D} F_i \frac{\partial s_i(q, t)}{\partial q_k} \quad\text{with the inverse}\quad F_i = \sum_{k=1}^{D} Q_k \frac{\partial q_k(s, t)}{\partial s_i} \tag{2.33}$$

The reason for this definition will become apparent in Section 2.9.
Substituting eqn (2.15) into this equation gives

$$Q_k = -\sum_{i=1}^{D} \frac{\partial U(s, t)}{\partial s_i} \frac{\partial s_i(q, t)}{\partial q_k} + \sum_{i=1}^{D} F_i^{(\mathrm{NP})} \frac{\partial s_i(q, t)}{\partial q_k} \tag{2.34}$$

If we consider the potential $U(q, t)$ in the q-system to be the same function as $U(s, t)$
but expressed in the $q, t$ variable set, then substitution of eqn (2.25) into $U(s, t)$ gives

$$U = U(q, t) = U\left(s_1(q, t), s_2(q, t), \ldots, s_D(q, t), t\right) \tag{2.35}$$

Thus the chain rule expansion of the compound function gives

$$\frac{\partial U(q, t)}{\partial q_k} = \sum_{i=1}^{D} \frac{\partial U(s, t)}{\partial s_i} \frac{\partial s_i(q, t)}{\partial q_k} \tag{2.36}$$

Equation (2.34) then becomes

$$Q_k = -\frac{\partial U(q, t)}{\partial q_k} + Q_k^{(\mathrm{NP})} \tag{2.37}$$

where we have defined $Q_k^{(\mathrm{NP})}$ to be the q-system generalized force corresponding to
$F_i^{(\mathrm{NP})}$ according to the rule defined in eqn (2.33),

$$Q_k^{(\mathrm{NP})} = \sum_{i=1}^{D} F_i^{(\mathrm{NP})} \frac{\partial s_i(q, t)}{\partial q_k} \quad\text{with the inverse}\quad F_i^{(\mathrm{NP})} = \sum_{k=1}^{D} Q_k^{(\mathrm{NP})} \frac{\partial q_k(s, t)}{\partial s_i} \tag{2.38}$$
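A short numerical illustration of eqn (2.33) may help. The sketch below assumes one particle in a plane with plane polar coordinates $q = (r, \phi)$ as the q-system and a central spring force $F_i = -k s_i$ (both choices are assumptions made here, as are the numerical values). For this force the potential is $U = kr^2/2$, so eqn (2.37) predicts $Q_r = -kr$ and $Q_\phi = 0$.

```python
import math

def jacobian(q):
    """Matrix ds_i/dq_k for s = (r cos(phi), r sin(phi)), q = (r, phi)."""
    r, phi = q
    return [[math.cos(phi), -r * math.sin(phi)],
            [math.sin(phi), r * math.cos(phi)]]

def generalized_force(F, q):
    """Q_k = sum_i F_i ds_i/dq_k, the first rule in eqn (2.33)."""
    J = jacobian(q)
    return [sum(F[i] * J[i][k] for i in range(2)) for k in range(2)]

k_spring = 4.0   # hypothetical force constant
q = [1.5, 0.7]   # hypothetical values of r and phi
s = [q[0] * math.cos(q[1]), q[0] * math.sin(q[1])]
F = [-k_spring * x for x in s]   # Cartesian components of the spring force

Q = generalized_force(F, q)      # Q[0] is Q_r, Q[1] is Q_phi
```

Note that $Q_\phi$ has the dimensions of a torque rather than a force, which is typical when a generalized coordinate is an angle.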



2.7 The Lagrangian Expressed in the q-System 

We have defined the Lagrangian L(s,s,t) in the s-system in eqn (2.18) above. The 
Lagrangian in the q-system L(q,q,t) is defined to be the same function, but expressed 
in terms of the q , q, t variable set. 12 Substituting eqns (2.25, 2.32) into L(s , i, t ) gives 
the Lagrangian as a compound function of q, q, t, 

L(q, q,t) = L (.S’ | (q, t), s 2 (q, t), s D (q , t), s\(q, q , t), s 2 (q, q,t),..., s D (q, q, t), t) 

(2.39) 

Equation (2.18) and the expansion in eqn (2.31) then give the Lagrangian in the
q-system in an expanded form

$$\begin{aligned}
L = L(q,\dot q,t) &= \frac{1}{2}\sum_{j=1}^{D} M_j \left(\dot s_j(q,\dot q,t)\right)^2 - U\left(s_1(q,t), s_2(q,t), \ldots, s_D(q,t), t\right) \\
&= \frac{1}{2}\sum_{j=1}^{D} M_j \left(\sum_{k=1}^{D} \frac{\partial s_j(q,t)}{\partial q_k}\,\dot q_k + \frac{\partial s_j(q,t)}{\partial t}\right)\left(\sum_{l=1}^{D} \frac{\partial s_j(q,t)}{\partial q_l}\,\dot q_l + \frac{\partial s_j(q,t)}{\partial t}\right) \\
&\qquad - U\left(s_1(q,t), s_2(q,t), \ldots, s_D(q,t), t\right)
\end{aligned} \tag{2.40}$$

where each $\dot s_j$ factor has been replaced by a separate sum. Exchanging the order of
the finite sums and collecting terms then gives

$$L = L(q,\dot q,t) = T_2(q,\dot q,t) + T_1(q,\dot q,t) + T_0(q,t) - U(q,t) \tag{2.41}$$

where the kinetic energy is broken down into three terms

$$T(q,\dot q,t) = T_2(q,\dot q,t) + T_1(q,\dot q,t) + T_0(q,t) \tag{2.42}$$



where

$$T_2(q,\dot q,t) = \frac{1}{2}\sum_{k=1}^{D}\sum_{l=1}^{D} m_{kl}(q,t)\,\dot q_k \dot q_l \quad\text{with}\quad m_{kl}(q,t) = \sum_{j=1}^{D} M_j\,\frac{\partial s_j(q,t)}{\partial q_k}\,\frac{\partial s_j(q,t)}{\partial q_l} \tag{2.43}$$

is homogeneous of degree two in the set of variables $\dot q_1, \dot q_2, \ldots, \dot q_D$,

$$T_1(q,\dot q,t) = \sum_{k=1}^{D} n_k(q,t)\,\dot q_k \quad\text{with}\quad n_k(q,t) = \sum_{j=1}^{D} M_j\,\frac{\partial s_j(q,t)}{\partial q_k}\,\frac{\partial s_j(q,t)}{\partial t} \tag{2.44}$$

is homogeneous of degree one in the same variables, and

$$T_0(q,t) = \frac{1}{2}\sum_{j=1}^{D} M_j \left(\frac{\partial s_j(q,t)}{\partial t}\right)^2 \tag{2.45}$$

and

$$U(q,t) = U\left(s_1(q,t), s_2(q,t), \ldots, s_D(q,t), t\right) \tag{2.46}$$

are independent of the generalized velocities $\dot q$.

¹² We follow the physics custom which uses the same letter $L$ in both the s and q systems, and considers
$L(s,\dot s,t)$ and $L(q,\dot q,t)$ to be the same function expressed in different coordinates. See the discussion in
Section D.5.



2.8 Two Important Identities 

The proof of the form invariance of the Lagrange equations in Section 2.9 requires 
the following Lemma. These two identities are formal consequences of eqn (2.31) and 
the properties of partial derivatives, and are true only because of the simple form of 
eqn (2.25) in which the $s_i = s_i(q,t)$ depend only on $q$ and $t$.

Lemma 2.8.1: Identities in Configuration Space 

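It follows from the expansion in eqn (2.31) that the following two identities hold,

$$\frac{\partial \dot s_i(q,\dot q,t)}{\partial \dot q_k} = \frac{\partial s_i(q,t)}{\partial q_k} \quad\text{and}\quad \frac{\partial \dot s_i(q,\dot q,t)}{\partial q_k} = \frac{d}{dt}\left(\frac{\partial s_i(q,t)}{\partial q_k}\right) \tag{2.47}$$

Proof: The first of these follows immediately from the fact that both $\partial s_i(q,t)/\partial q_k$
and $\partial s_i(q,t)/\partial t$ in eqn (2.31) are functions only of $q, t$, so that the explicit linear term
in $\dot q_k$ is the only place that the variables $\dot q$ appear. The partial derivative of $\dot s_i(q,\dot q,t)$
with respect to $\dot q_k$ is thus the coefficient of the $\dot q_k$ in eqn (2.31), which proves the first
of eqn (2.47).

The second identity in eqn (2.47) requires a somewhat longer proof. From eqn
(2.31), the left side of this second equation may be written as

$$\frac{\partial \dot s_i(q,\dot q,t)}{\partial q_k} = \sum_{l=1}^{D} \frac{\partial^2 s_i(q,t)}{\partial q_k\,\partial q_l}\,\dot q_l + \frac{\partial^2 s_i(q,t)}{\partial q_k\,\partial t} \tag{2.48}$$

The right side of the second equation may be expanded by noting that, for any function
$g(q,t)$,

$$\frac{dg(q,t)}{dt} = \sum_{l=1}^{D} \frac{\partial g(q,t)}{\partial q_l}\,\dot q_l + \frac{\partial g(q,t)}{\partial t} \tag{2.49}$$

Setting $g(q,t) = \partial s_i(q,t)/\partial q_k$ thus gives the right side of the second equation as

$$\frac{d}{dt}\left(\frac{\partial s_i(q,t)}{\partial q_k}\right) = \sum_{l=1}^{D} \frac{\partial}{\partial q_l}\left(\frac{\partial s_i(q,t)}{\partial q_k}\right)\dot q_l + \frac{\partial}{\partial t}\left(\frac{\partial s_i(q,t)}{\partial q_k}\right) \tag{2.50}$$

which is equal to eqn (2.48) when the order of partial derivatives is exchanged. □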
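The two identities of eqn (2.47) can be checked by hand in a simple case. The following sketch, which is not part of the original text, assumes a single Cartesian coordinate $x = r\cos\theta$ written in plane polar coordinates $r, \theta$.

```latex
\begin{align*}
x &= r\cos\theta, \qquad \dot x = \dot r\cos\theta - r\,\dot\theta\sin\theta \\
\frac{\partial \dot x}{\partial \dot r} &= \cos\theta = \frac{\partial x}{\partial r},
\qquad
\frac{\partial \dot x}{\partial \dot\theta} = -r\sin\theta = \frac{\partial x}{\partial \theta} \\
\frac{\partial \dot x}{\partial \theta} &= -\dot r\sin\theta - r\,\dot\theta\cos\theta
 = \frac{d}{dt}\left(-r\sin\theta\right) = \frac{d}{dt}\left(\frac{\partial x}{\partial \theta}\right)
\end{align*}
```

The second line illustrates the first identity of eqn (2.47) and the third line illustrates the second identity.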
2.9 Invariance of the Lagrange Equations

We now come to the main theorem of this chapter: The Lagrange equations are form
invariant under a change of generalized coordinates.

Theorem 2.9.1: Invariance of Lagrange Equations

Assume that a change of coordinates is made from the s-system to the q-system (assumed
to be any good generalized coordinates), as defined by eqn (2.25). Define the Lagrangian
function in the q-system by eqn (2.39), and the non-potential generalized force in the
q-system by eqn (2.38). Then the Lagrange equations in the s-system,

$$\frac{d}{dt}\left(\frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\right) - \frac{\partial L(s,\dot s,t)}{\partial s_i} = F_i^{(NP)} \tag{2.51}$$

hold for all $i = 1, \ldots, D$ if and only if the Lagrange equations in the q-system,

$$\frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\right) - \frac{\partial L(q,\dot q,t)}{\partial q_k} = Q_k^{(NP)} \tag{2.52}$$

hold for all $k = 1, \ldots, D$.



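Proof: We first prove that eqn (2.51) implies eqn (2.52). Multiplying both sides of
eqn (2.51) by $\partial s_i(q,t)/\partial q_k$ and summing over $i$ gives

$$\sum_{i=1}^{D} \frac{\partial s_i(q,t)}{\partial q_k}\,\frac{d}{dt}\left(\frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\right) - \sum_{i=1}^{D} \frac{\partial s_i(q,t)}{\partial q_k}\,\frac{\partial L(s,\dot s,t)}{\partial s_i} = \sum_{i=1}^{D} \frac{\partial s_i(q,t)}{\partial q_k}\,F_i^{(NP)} \tag{2.53}$$

If $f$ and $g$ are any functions, it follows from the product rule for differentiation that
$f\,(dg/dt) = d(fg)/dt - g\,(df/dt)$. Applying this rule with $f = \partial s_i(q,t)/\partial q_k$ and
$g = \partial L(s,\dot s,t)/\partial \dot s_i$ allows the first term in eqn (2.53) to be rewritten as

$$\begin{aligned}
\sum_{i=1}^{D} \frac{\partial s_i(q,t)}{\partial q_k}\,\frac{d}{dt}\left(\frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\right)
&= \frac{d}{dt}\left(\sum_{i=1}^{D} \frac{\partial s_i(q,t)}{\partial q_k}\,\frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\right) - \sum_{i=1}^{D} \frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\,\frac{d}{dt}\left(\frac{\partial s_i(q,t)}{\partial q_k}\right) \\
&= \frac{d}{dt}\left(\sum_{i=1}^{D} \frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\,\frac{\partial \dot s_i(q,\dot q,t)}{\partial \dot q_k}\right) - \sum_{i=1}^{D} \frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\,\frac{\partial \dot s_i(q,\dot q,t)}{\partial q_k}
\end{aligned} \tag{2.54}$$

where the first and second identities in eqn (2.47) were used to rewrite the first and
second terms on the right side of eqn (2.54), respectively.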

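Thus, rearranging terms slightly and using eqn (2.38) to replace the term on the
right by $Q_k^{(NP)}$, eqn (2.53) may be written as

$$\frac{d}{dt}\left(\sum_{i=1}^{D} \frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\,\frac{\partial \dot s_i(q,\dot q,t)}{\partial \dot q_k}\right) - \left(\sum_{i=1}^{D} \frac{\partial L(s,\dot s,t)}{\partial s_i}\,\frac{\partial s_i(q,t)}{\partial q_k} + \sum_{i=1}^{D} \frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\,\frac{\partial \dot s_i(q,\dot q,t)}{\partial q_k}\right) = Q_k^{(NP)} \tag{2.55}$$

But the first parenthesis in eqn (2.55) is the chain rule expansion of $\partial L(q,\dot q,t)/\partial \dot q_k$
where $L(q,\dot q,t)$ is the compound function defined in eqn (2.39). And the second
parenthesis in eqn (2.55) is the chain rule expansion of $\partial L(q,\dot q,t)/\partial q_k$. Thus eqn
(2.55) becomes

$$\frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\right) - \frac{\partial L(q,\dot q,t)}{\partial q_k} = Q_k^{(NP)} \tag{2.56}$$

which is the same as eqn (2.52), as was to be proved.

To prove the converse, that eqn (2.52) implies eqn (2.51), we start from eqn (2.56)
and reverse the chain of algebra to arrive at eqn (2.53). Multiplying that equation by
$\partial q_k(s,t)/\partial s_j$, summing over $k = 1, \ldots, D$, and using eqn (2.30) gives

$$\frac{d}{dt}\left(\frac{\partial L(s,\dot s,t)}{\partial \dot s_j}\right) - \frac{\partial L(s,\dot s,t)}{\partial s_j} = F_j^{(NP)} \tag{2.57}$$

which is identical to eqn (2.51), as was to be proved. □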



2.10 Relation Between Any Two Systems 

The q-system above is taken to be any good system of generalized coordinates. If 
we imagine it and any other good system, which we may call the r-system, then it 
follows from what we’ve done above that the Lagrange equations in this r-system are 
equivalent to the Lagrange equations in the q-system. Both of them are equivalent to 
the s-system, hence they are equivalent to each other. But it may be useful to state 
explicitly the relations between the q- and the r-systems. We state these relations 
without proof, since their proof follows the pattern just established in going from the 
s-system to the q-system. 

The transformation between the q- and r-systems is

$$q_k = q_k(r,t) \quad\text{and the inverse}\quad r_j = r_j(q,t) \tag{2.58}$$

Since both the q- and the r-systems are good generalized coordinates, the determinant
conditions for transformations between them are

$$\left|\frac{\partial q(r,t)}{\partial r}\right| \neq 0 \quad\text{and}\quad \left|\frac{\partial r(q,t)}{\partial q}\right| \neq 0 \tag{2.59}$$

All generalized forces in the q-system $Q_k$ are related to those in the r-system $R_j$ by

$$R_j = \sum_{k=1}^{D} Q_k\,\frac{\partial q_k(r,t)}{\partial r_j} \quad\text{and the inverse}\quad Q_k = \sum_{j=1}^{D} R_j\,\frac{\partial r_j(q,t)}{\partial q_k} \tag{2.60}$$

The Lagrangian in the r-system is defined as the compound function obtained by
substituting $q_k = q_k(r,t)$ and $\dot q_k = \dot q_k(r,\dot r,t)$ into $L(q,\dot q,t)$ as

$$L(r,\dot r,t) = L\left(q(r,t), \dot q(r,\dot r,t), t\right) \tag{2.61}$$

Then

$$\frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\right) - \frac{\partial L(q,\dot q,t)}{\partial q_k} = Q_k^{(NP)} \tag{2.62}$$

for all $k = 1, \ldots, D$ if and only if

$$\frac{d}{dt}\left(\frac{\partial L(r,\dot r,t)}{\partial \dot r_j}\right) - \frac{\partial L(r,\dot r,t)}{\partial r_j} = R_j^{(NP)} \tag{2.63}$$

for all $j = 1, \ldots, D$.



2.11 More of the Simple Example 

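Suppose that the simple example of Section 2.3 is transformed to a q-system consisting
of spherical polar coordinates. Choose $q_1 = r$, $q_2 = \theta$, $q_3 = \phi$. Then for $i = 1, 2, 3$,
respectively, the equations $s_i = s_i(q,t)$ in eqn (2.25) take the form

$$x = r\sin\theta\cos\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\theta \tag{2.64}$$

and the equations $\dot s_i = \dot s_i(q_1, q_2, \ldots, q_D, \dot q_1, \dot q_2, \ldots, \dot q_D, t)$ of eqn (2.32) are, again for
$i = 1, 2, 3$, respectively,

$$\begin{aligned}
\dot x &= \dot r\sin\theta\cos\phi + r\dot\theta\cos\theta\cos\phi - r\sin\theta\,\dot\phi\sin\phi \\
\dot y &= \dot r\sin\theta\sin\phi + r\dot\theta\cos\theta\sin\phi + r\sin\theta\,\dot\phi\cos\phi \\
\dot z &= \dot r\cos\theta - r\dot\theta\sin\theta
\end{aligned} \tag{2.65}$$

Note that these equations are linear in the dotted variables, as advertised in eqn
(2.31). Substituting eqns (2.64, 2.65) into the Lagrangian of eqn (2.23) following the
recipe given in eqn (2.39), we obtain, after some simplification,

$$L = L(q,\dot q,t) = \frac{1}{2}m\left(\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2\right) - \frac{1}{2}kr^2 \tag{2.66}$$

The three Lagrange equations eqn (2.52) are then, for $k = 1, 2, 3$, respectively,

$$\begin{aligned}
k=1:&\quad \frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot r}\right) - \frac{\partial L(q,\dot q,t)}{\partial r} = 0 \quad\text{or}\quad m\ddot r - mr\dot\theta^2 - mr\sin^2\theta\,\dot\phi^2 + kr = 0 \\
k=2:&\quad \frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot\theta}\right) - \frac{\partial L(q,\dot q,t)}{\partial \theta} = 0 \quad\text{or}\quad \frac{d}{dt}\left(mr^2\dot\theta\right) - mr^2\sin\theta\cos\theta\,\dot\phi^2 = 0 \\
k=3:&\quad \frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot\phi}\right) - \frac{\partial L(q,\dot q,t)}{\partial \phi} = 0 \quad\text{or}\quad \frac{d}{dt}\left(mr^2\sin^2\theta\,\dot\phi\right) = 0
\end{aligned} \tag{2.67}$$

which are the correct equations of motion in the q-system.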
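The spherical-polar equations of motion above can be probed numerically. The following is a minimal sketch, not part of the original text: it solves the three equations for the second derivatives, integrates them with a hand-written fourth-order Runge-Kutta step, and monitors the quantity $mr^2\sin^2\theta\,\dot\phi$ and the total energy. The values $m = k = 1$, the initial state, and the step size are arbitrary assumptions chosen only for illustration.

```python
import math

def derivative(state, m=1.0, k=1.0):
    """Time derivative of (r, theta, phi, r_dot, theta_dot, phi_dot).

    The accelerations are obtained by solving the three Lagrange equations
    of the spherical-polar oscillator for the second derivatives.
    """
    r, th, ph, rd, thd, phd = state
    r_dd = r * thd**2 + r * math.sin(th)**2 * phd**2 - (k / m) * r
    th_dd = math.sin(th) * math.cos(th) * phd**2 - 2.0 * rd * thd / r
    ph_dd = -2.0 * rd * phd / r - 2.0 * thd * phd * math.cos(th) / math.sin(th)
    return (rd, thd, phd, r_dd, th_dd, ph_dd)

def rk4_step(state, dt):
    """One classical Runge-Kutta step (an assumed integrator, not from the text)."""
    def shifted(slope, h):
        return tuple(s + h * d for s, d in zip(state, slope))
    k1 = derivative(state)
    k2 = derivative(shifted(k1, dt / 2))
    k3 = derivative(shifted(k2, dt / 2))
    k4 = derivative(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def p_phi(state, m=1.0):
    """The quantity m r^2 sin^2(theta) phi_dot from the k = 3 equation."""
    r, th, ph, rd, thd, phd = state
    return m * r**2 * math.sin(th)**2 * phd

def energy(state, m=1.0, k=1.0):
    """Kinetic plus potential energy in spherical polar variables."""
    r, th, ph, rd, thd, phd = state
    kinetic = 0.5 * m * (rd**2 + r**2 * thd**2 + r**2 * math.sin(th)**2 * phd**2)
    return kinetic + 0.5 * k * r**2

def max_drift(state, dt=1e-3, steps=5000):
    """Largest change in p_phi and in the energy seen along the trajectory."""
    p0, e0 = p_phi(state), energy(state)
    dp = de = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        dp = max(dp, abs(p_phi(state) - p0))
        de = max(de, abs(energy(state) - e0))
    return dp, de

if __name__ == "__main__":
    # Hypothetical initial state, chosen to stay away from the z-axis.
    print(max_drift((1.0, 1.0, 0.0, 0.1, 0.2, 0.5)))
```

Both drifts should stay at the level of the integration error, which is what one would expect if the $k = 3$ equation expresses a conservation law and the energy of this time-independent system is constant.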
2.12 Generalized Momenta in the q-System

In eqn (2.19), the generalized momenta $P_i = M_i \dot s_i$ in the s-system were derived from
partial differentiation of the Lagrangian, $P_i = \partial L(s,\dot s,t)/\partial \dot s_i$. The generalized
momenta in the q-system are defined by a similar partial differentiation,

$$p_k = p_k(q,\dot q,t) = \frac{\partial L(q,\dot q,t)}{\partial \dot q_k} \tag{2.68}$$

The expansion of the Lagrangian in eqns (2.41 - 2.46) shows that this $p_k$ can be
expanded as

$$p_k(q,\dot q,t) = \sum_{l=1}^{D} m_{kl}(q,t)\,\dot q_l + n_k(q,t) \tag{2.69}$$

A transformation law can be found between the generalized momenta in the s- and
q-systems. Using eqns (2.39, 2.68) and the chain rule gives

$$p_k = \frac{\partial L(q,\dot q,t)}{\partial \dot q_k} = \sum_{i=1}^{D} \frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\,\frac{\partial \dot s_i(q,\dot q,t)}{\partial \dot q_k} = \sum_{i=1}^{D} P_i\,\frac{\partial s_i(q,t)}{\partial q_k} \tag{2.70}$$

where eqn (2.19), and the first of eqn (2.47) from Lemma 2.8.1, have been used in
the final expression. Using eqn (2.30), the inverse relation can also be written

$$P_i = \sum_{k=1}^{D} p_k\,\frac{\partial q_k(s,t)}{\partial s_i} \tag{2.71}$$

The pair of quantities $q_k, p_k$ are referred to as conjugates. The $p_k$ is called the
conjugate momentum of coordinate $q_k$, and the $q_k$ is called the conjugate coordinate
of momentum $p_k$. The same nomenclature is applied also to the pair $s_i, P_i$, and to
similar pairs in any system of coordinates.
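As a worked illustration, not part of the original text, the definition of eqn (2.68) can be applied to the spherical-polar Lagrangian of eqn (2.66), and the transformation law of eqn (2.70) can be checked for the momentum conjugate to $\phi$. The labels $p_r$, $p_\theta$, $p_\phi$ are assumed shorthand for $p_1$, $p_2$, $p_3$.

```latex
\begin{align*}
p_r &= \frac{\partial L}{\partial \dot r} = m\dot r, \qquad
p_\theta = \frac{\partial L}{\partial \dot\theta} = m r^2 \dot\theta, \qquad
p_\phi = \frac{\partial L}{\partial \dot\phi} = m r^2 \sin^2\theta\,\dot\phi \\
\sum_{i=1}^{3} P_i\,\frac{\partial s_i}{\partial \phi}
 &= m\dot x\,(-r\sin\theta\sin\phi) + m\dot y\,(r\sin\theta\cos\phi) + m\dot z\cdot 0
  = m\,(x\dot y - y\dot x) = m r^2 \sin^2\theta\,\dot\phi
\end{align*}
```

The last expression is the z-component of the angular momentum, so in this example $p_\phi$ obtained from eqn (2.68) agrees with the value given by eqn (2.70).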



2.13 Ignorable Coordinates 

The Lagrange equations in the general q-system, eqn (2.52), may be written in the
form of two coupled equations,

$$\dot p_k = \frac{\partial L(q,\dot q,t)}{\partial q_k} + Q_k^{(NP)} \quad\text{and}\quad p_k = \frac{\partial L(q,\dot q,t)}{\partial \dot q_k} \tag{2.72}$$

If $Q_k^{(NP)} = 0$ and $\partial L(q,\dot q,t)/\partial q_k = 0$ for a particular $k$ value, then we say that the
variable $q_k$ is ignorable. In this case, its conjugate momentum $p_k$ is said to be conserved,
which means that its time derivative vanishes and hence it is equal to a constant
which may be taken to be its value at $t = 0$. If $q_k$ is ignorable, then

$$p_k(t) = C_k = p_k(0) \tag{2.73}$$

For example, variable $\phi$ in Section 2.11 is ignorable.
2.14 Some Remarks About Units

Notice that the generalized coordinates in the s-system all have units of length. But
the generalized coordinates in the q-system may have other units. For example, in the
simple example in Section 2.11, the variables $q_2$ and $q_3$ are angles and hence unitless.
However, there are certain products that will always have the same units, regardless
of which system is used.

Using eqns (2.25, 2.70), with the notation $\delta q$ for a differential at fixed time with
$\delta t = 0$, the chain rule gives

$$\sum_{k=1}^{D} p_k\,\delta q_k = \sum_{k=1}^{D}\left(\sum_{i=1}^{D} P_i\,\frac{\partial s_i(q,t)}{\partial q_k}\right)\delta q_k = \sum_{i=1}^{D} P_i\left(\sum_{k=1}^{D} \frac{\partial s_i(q,t)}{\partial q_k}\,\delta q_k\right) = \sum_{i=1}^{D} P_i\,\delta s_i \tag{2.74}$$

It follows that the units of each product $p_k\,\delta q_k$ must be the same as the units of the
products $P_i\,\delta s_i$, which are $ML^2/T$, the units of what is called action. Thus, in the simple
example of Section 2.11, the $p_2$ and $p_3$ generalized momenta are seen to be angular
momenta, which have the same units as action.

Similarly, denoting differentials with time fixed by $\delta q_k$ and $\delta s_i$, eqn (2.33) and the
chain rule show that

$$\sum_{k=1}^{D} Q_k\,\delta q_k = \sum_{k=1}^{D}\left(\sum_{i=1}^{D} F_i\,\frac{\partial s_i(q,t)}{\partial q_k}\right)\delta q_k = \sum_{i=1}^{D} F_i\left(\sum_{k=1}^{D} \frac{\partial s_i(q,t)}{\partial q_k}\,\delta q_k\right) = \sum_{i=1}^{D} F_i\,\delta s_i \tag{2.75}$$

It follows that each product $Q_k\,\delta q_k$ must have the same units as the products $F_i\,\delta s_i$,
which are $ML^2/T^2$, the units of work and energy. Thus, in the simple example of
Section 2.11, the $Q_2$ and $Q_3$ generalized forces are torques, which have the same
units as work.

The results in this section can be very useful, allowing a unit check of sorts to be
performed even in complex Lagrangian systems for which the units of the $q_k$ may be
very strange.
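The unit check described in this section can be mechanized in a few lines. The following is a minimal sketch, not part of the original text: units are represented as exponents of mass, length, and time, and the products $p_k\,\delta q_k$ and $Q_k\,\delta q_k$ are tested for a radial coordinate and an angle. The helper names `Dim`, `times`, and `per_time` are hypothetical and introduced only for this illustration.

```python
from collections import namedtuple

# Exponents of mass, length and time: Dim(1, 2, -1) stands for M L^2 / T.
Dim = namedtuple("Dim", ["M", "L", "T"])

def times(a, b):
    """Units of a product: the exponents add."""
    return Dim(a.M + b.M, a.L + b.L, a.T + b.T)

def per_time(a):
    """Units of a time derivative: one more inverse power of time."""
    return Dim(a.M, a.L, a.T - 1)

DIMENSIONLESS = Dim(0, 0, 0)   # an angle such as theta or phi
MASS = Dim(1, 0, 0)
LENGTH = Dim(0, 1, 0)
ACTION = Dim(1, 2, -1)         # M L^2 / T
ENERGY = Dim(1, 2, -2)         # M L^2 / T^2

# Generalized momenta of the spherical-polar example.
p_r = times(MASS, per_time(LENGTH))                        # m * r_dot
p_theta = times(times(MASS, times(LENGTH, LENGTH)),
                per_time(DIMENSIONLESS))                   # m * r^2 * theta_dot

# Generalized forces: a force component and a torque.
FORCE = per_time(per_time(times(MASS, LENGTH)))            # M L / T^2
Q_r = FORCE
Q_theta = times(LENGTH, FORCE)                             # lever arm times force

if __name__ == "__main__":
    print(times(p_r, LENGTH) == ACTION, times(p_theta, DIMENSIONLESS) == ACTION)
    print(times(Q_r, LENGTH) == ENERGY, times(Q_theta, DIMENSIONLESS) == ENERGY)
```

Each momentum times its coordinate differential comes out as an action, and each generalized force times its coordinate differential comes out as an energy, even though the coordinates themselves have different units.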



2.15 The Generalized Energy Function 

We have defined generalized coordinates, velocities, and momenta. We now define
what may be thought of as a generalized energy. The generalized energy function
(sometimes called the Jacobi-integral function) $H_q$ in a general q-system is defined to
be

$$H_q = H_q(q,\dot q,t) = \sum_{k=1}^{D} \frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\,\dot q_k - L(q,\dot q,t) = \sum_{k=1}^{D} p_k(q,\dot q,t)\,\dot q_k - L(q,\dot q,t) \tag{2.76}$$

The generalized energy function in the s-system is defined similarly,

$$H_s = H_s(s,\dot s,t) = \sum_{i=1}^{D} \frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\,\dot s_i - L(s,\dot s,t) = \sum_{i=1}^{D} P_i(s,\dot s,t)\,\dot s_i - L(s,\dot s,t) \tag{2.77}$$

The subscripts on $H_q$ and $H_s$ are to emphasize that, unlike the Lagrangian function in
the s- and q-systems, the $H_q$ and $H_s$ are not in general the same function. One cannot
go from one to the other by simply making a coordinate substitution as we did for $L$.

Theorem 2.15.1: The Generalized Energy Theorem 

The total time derivatives of the generalized energy functions are given by

$$\frac{dH_q}{dt} = \dot H_q = \sum_{k=1}^{D} Q_k^{(NP)}\,\dot q_k - \frac{\partial L(q,\dot q,t)}{\partial t} \tag{2.78}$$

$$\frac{dH_s}{dt} = \dot H_s = \sum_{i=1}^{D} F_i^{(NP)}\,\dot s_i - \frac{\partial L(s,\dot s,t)}{\partial t} \tag{2.79}$$

Proof: The proof will be given for the q-system since the s-system proof is identical.
From eqn (2.76),

$$\begin{aligned}
\dot H_q &= \sum_{k=1}^{D}\left(\dot p_k \dot q_k + p_k \ddot q_k\right) - \frac{dL(q,\dot q,t)}{dt} \\
&= \sum_{k=1}^{D}\left(\dot p_k \dot q_k + p_k \ddot q_k - \frac{\partial L(q,\dot q,t)}{\partial q_k}\,\dot q_k - \frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\,\ddot q_k\right) - \frac{\partial L(q,\dot q,t)}{\partial t}
\end{aligned} \tag{2.80}$$

Using eqn (2.68) to cancel the $\ddot q_k$ terms, and eqn (2.72) for $\dot p_k$, gives eqn (2.78) as
was to be proved. □

Equations (2.78, 2.79) are generalized work-energy theorems. If the non-potential
forces vanish identically for all index values, and if the Lagrangian does not contain
the letter $t$ explicitly, then the generalized energy function will be conserved. For
example, in the q-system $Q_k^{(NP)} = 0$ and $\partial L(q,\dot q,t)/\partial t = 0$ would imply that $\dot H_q = 0$,
and hence that

$$H_q(q(t), \dot q(t), t) = C = H_q(q(0), \dot q(0), 0) \tag{2.81}$$



2.16 The Generalized Energy and the Total Energy 

One can easily show using eqns (2.18, 2.77) that

$$H_s = \frac{1}{2}\sum_{j=1}^{D} M_j \dot s_j^2 + U(s_1, s_2, \ldots, s_D, t) = T + U = E \tag{2.82}$$

where $E$ is identical to the total energy defined in Section 1.8. So the s-system
generalized energy function $H_s$ is equal to the total energy.

The situation is different in the general q-system, however. Using eqns (2.41,
2.76),

$$H_q = \sum_{k=1}^{D} \dot q_k\,\frac{\partial T_2(q,\dot q,t)}{\partial \dot q_k} + \sum_{k=1}^{D} \dot q_k\,\frac{\partial T_1(q,\dot q,t)}{\partial \dot q_k} + \sum_{k=1}^{D} \dot q_k\,\frac{\partial T_0(q,t)}{\partial \dot q_k} - L(q,\dot q,t) \tag{2.83}$$

Since functions $T_l$ are homogeneous of degree $l$ in the generalized velocities $\dot q_k$, the
Euler condition from Theorem D.31.1 shows that

$$\sum_{k=1}^{D} \dot q_k\,\frac{\partial T_2(q,\dot q,t)}{\partial \dot q_k} = 2T_2(q,\dot q,t) \qquad \sum_{k=1}^{D} \dot q_k\,\frac{\partial T_1(q,\dot q,t)}{\partial \dot q_k} = T_1(q,\dot q,t) \qquad \sum_{k=1}^{D} \dot q_k\,\frac{\partial T_0(q,t)}{\partial \dot q_k} = 0 \tag{2.84}$$

Therefore

$$H_q = 2T_2 + T_1 - (T_2 + T_1 + T_0 - U) = T_2 - T_0 + U \tag{2.85}$$

and $H_q$ is related to $H_s = T + U$ by

$$H_q = (T + U) - (T_1 + 2T_0) = H_s - (T_1 + 2T_0) = E - (T_1 + 2T_0) \tag{2.86}$$

which is not in general equal to the total energy $E$.

Examination of eqns (2.44, 2.45, 2.86) shows that the condition for $H_q$ to equal
$H_s = T + U$ is for the coordinate transformation equation not to contain the letter $t$
explicitly. Then $\partial s_i(q,t)/\partial t = 0$, which in turn implies that both $T_1$ and $T_0$ are zero.
Thus

$$\frac{\partial s_i(q,t)}{\partial t} = 0 \quad\text{implies that}\quad H_q = H_s = T + U \tag{2.87}$$

Note to the Reader: The condition for $H_q$ to be conserved (which, in the absence
of non-potential forces, is $\partial L(q,\dot q,t)/\partial t = 0$) is independent of the condition for
$H_q = T + U$ (which is $\partial s_i(q,t)/\partial t = 0$). The $H_q$ may be conserved even when the
total energy $E$ is not.

The generalized energy function is most useful in problem solutions when it is
conserved. And if $H_q$ is conserved, it usually makes little difference to the problem
solution whether or not $H_q$ equals $T + U$. For conservation implies the equation
$H_q(q,\dot q,t) = C$, a first-order differential equation and a first integral of the equations
of motion, regardless of the relation of $H_q$ to the total energy.
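A one-dimensional sketch may help to fix the distinction between $H_q$ and $E$. The example below is not part of the original text: it assumes a free particle of mass $M$ (so $U = 0$) and a uniformly translating coordinate $s = q + ut$, where the constant $u$ is a hypothetical frame velocity introduced only for this illustration.

```latex
\begin{align*}
s &= q + ut, \qquad \dot s = \dot q + u \\
T_2 &= \tfrac{1}{2}M\dot q^2, \qquad T_1 = M u\,\dot q, \qquad T_0 = \tfrac{1}{2}M u^2 \\
H_q &= T_2 - T_0 + U = \tfrac{1}{2}M\dot q^2 - \tfrac{1}{2}M u^2 \\
E - (T_1 + 2T_0) &= \tfrac{1}{2}M(\dot q + u)^2 - M u\,\dot q - M u^2
 = \tfrac{1}{2}M\dot q^2 - \tfrac{1}{2}M u^2 = H_q
\end{align*}
```

Here $\partial s/\partial t = u \neq 0$, so $H_q$ differs from $E = \tfrac{1}{2}M(\dot q + u)^2$, in agreement with eqn (2.86). Since $L = \tfrac{1}{2}M(\dot q + u)^2$ does not contain $t$ explicitly, $H_q$ is nevertheless conserved.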



2.17 Velocity Dependent Potentials 

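The problem of $N$ charged particles in a given, externally applied electromagnetic
field can also be reduced to Lagrangian form. We use the s-system of generalized
coordinates, expressed in vector notation.

The Lorentz force acting on the $n$th particle is

$$\mathbf{f}_n = q_n^{(ch)}\,\mathbf{E}(\mathbf{r}_n,t) + \frac{q_n^{(ch)}}{c}\,\mathbf{v}_n \times \mathbf{B}(\mathbf{r}_n,t) \tag{2.88}$$

where $q_n^{(ch)}$ is the charge of the particle, $\mathbf{E}(\mathbf{r},t)$ is the electric field, $\mathbf{B}(\mathbf{r},t)$ is the
magnetic induction field, and

$$\mathbf{r}_n = x_{n1}\hat{\mathbf{e}}_1 + x_{n2}\hat{\mathbf{e}}_2 + x_{n3}\hat{\mathbf{e}}_3 \quad\text{and}\quad \mathbf{v}_n = \dot x_{n1}\hat{\mathbf{e}}_1 + \dot x_{n2}\hat{\mathbf{e}}_2 + \dot x_{n3}\hat{\mathbf{e}}_3 \tag{2.89}$$

are the particle's position and velocity, respectively.

Introducing the scalar potential $\Phi(\mathbf{r},t)$ and the vector potential $\mathbf{A}(\mathbf{r},t)$, the electric
and magnetic induction fields at the particle location $\mathbf{r}_n$ may be written

$$\mathbf{E}(\mathbf{r}_n,t) = -\frac{\partial \Phi(\mathbf{r}_n,t)}{\partial \mathbf{r}_n} - \frac{1}{c}\,\frac{\partial \mathbf{A}(\mathbf{r}_n,t)}{\partial t} \quad\text{and}\quad \mathbf{B}(\mathbf{r}_n,t) = \nabla_n \times \mathbf{A}(\mathbf{r}_n,t) \tag{2.90}$$

where the notation for the gradient vector operator

$$\nabla_n = \frac{\partial}{\partial \mathbf{r}_n} = \hat{\mathbf{e}}_1\frac{\partial}{\partial x_{n1}} + \hat{\mathbf{e}}_2\frac{\partial}{\partial x_{n2}} + \hat{\mathbf{e}}_3\frac{\partial}{\partial x_{n3}} \tag{2.91}$$

has been introduced.¹³

Substituting eqn (2.90) into the Lorentz force eqn (2.88) gives

$$\mathbf{f}_n = -\frac{\partial}{\partial \mathbf{r}_n}\left(q_n^{(ch)}\,\Phi(\mathbf{r}_n,t)\right) - \frac{\partial}{\partial t}\left(\frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t)\right) + \frac{\partial}{\partial \mathbf{r}_n}\left(\mathbf{v}_n \cdot \frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t)\right) - \mathbf{v}_n \cdot \frac{\partial}{\partial \mathbf{r}_n}\left(\frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t)\right) \tag{2.92}$$

where the triple cross product has been expanded, using the usual Lagrangian list of
variables $\mathbf{r}_n, \mathbf{v}_n, t$ to define the meaning of the partial differentials in the next-to-last
term.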

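Noting that the total time derivative of $q_n^{(ch)}\mathbf{A}(\mathbf{r}_n,t)/c$ can be written, using the
chain rule, as

$$\frac{d}{dt}\left(\frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t)\right) = \mathbf{v}_n \cdot \frac{\partial}{\partial \mathbf{r}_n}\left(\frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t)\right) + \frac{\partial}{\partial t}\left(\frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t)\right) \tag{2.93}$$

eqn (2.92) becomes

$$\mathbf{f}_n = -\frac{\partial}{\partial \mathbf{r}_n}\left(q_n^{(ch)}\,\Phi(\mathbf{r}_n,t) - \mathbf{v}_n \cdot \frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t)\right) - \frac{d}{dt}\left(\frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t)\right) \tag{2.94}$$

Defining the velocity dependent potential $U^{(vel)}$ by

$$U^{(vel)}(r,v,t) = \sum_{n=1}^{N}\left(q_n^{(ch)}\,\Phi(\mathbf{r}_n,t) - \mathbf{v}_n \cdot \frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t)\right) \tag{2.95}$$

gives

$$\frac{\partial U^{(vel)}(r,v,t)}{\partial \mathbf{v}_n} = -\frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t) \tag{2.96}$$

where the operator

$$\frac{\partial}{\partial \mathbf{v}_n} = \hat{\mathbf{e}}_1\frac{\partial}{\partial \dot x_{n1}} + \hat{\mathbf{e}}_2\frac{\partial}{\partial \dot x_{n2}} + \hat{\mathbf{e}}_3\frac{\partial}{\partial \dot x_{n3}} \tag{2.97}$$

has been introduced and the identity eqn (A.71) of Section A.11 has been used. Thus,
finally

$$\mathbf{f}_n = \frac{d}{dt}\left(\frac{\partial U^{(vel)}(r,v,t)}{\partial \mathbf{v}_n}\right) - \frac{\partial U^{(vel)}(r,v,t)}{\partial \mathbf{r}_n} \tag{2.98}$$

expresses the Lorentz force in Lagrangian form.

¹³ See Section A.11 for a discussion of this notation, and cautions for its proper use.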

In the present vector notation, the total kinetic energy is given in the first of eqn
(2.12). Again using the identities in Section A.11, the equations of motion for this
problem can thus be written as the following sequence of equivalent expressions

$$\frac{d}{dt}\left(m_n \mathbf{v}_n\right) = \mathbf{f}_n \tag{2.99}$$

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \mathbf{v}_n}\right) = \frac{d}{dt}\left(\frac{\partial U^{(vel)}(r,v,t)}{\partial \mathbf{v}_n}\right) - \frac{\partial U^{(vel)}(r,v,t)}{\partial \mathbf{r}_n} \tag{2.100}$$

and hence

$$\frac{d}{dt}\left(\frac{\partial L(r,v,t)}{\partial \mathbf{v}_n}\right) - \frac{\partial L(r,v,t)}{\partial \mathbf{r}_n} = 0 \tag{2.101}$$

where the Lagrangian function for velocity dependent potentials is defined as

$$L(r,v,t) = T(v) - U^{(vel)}(r,v,t) \tag{2.102}$$

Written out, the Lagrangian is thus

$$L = \frac{1}{2}\sum_{n=1}^{N} m_n v_n^2 - \sum_{n=1}^{N} q_n^{(ch)}\,\Phi(\mathbf{r}_n,t) + \sum_{n=1}^{N} \frac{q_n^{(ch)}}{c}\,\mathbf{v}_n \cdot \mathbf{A}(\mathbf{r}_n,t) \tag{2.103}$$

The generalized momenta of particles in an electromagnetic field are not simply the
particle momenta $\mathbf{p}_n = m_n \mathbf{v}_n$. They are

$$\underline{\mathbf{p}}_n = \frac{\partial L(r,v,t)}{\partial \mathbf{v}_n} = m_n \mathbf{v}_n + \frac{q_n^{(ch)}}{c}\,\mathbf{A}(\mathbf{r}_n,t) \tag{2.104}$$

which might be considered as the vector sum of a particle momentum $\mathbf{p}_n = m_n \mathbf{v}_n$ and
a field momentum $q_n^{(ch)}\mathbf{A}(\mathbf{r}_n,t)/c$. It is this generalized momentum that is conserved
when the coordinate $\mathbf{r}_n$ is ignorable.

The generalized energy function can also be found,

$$H_s = \sum_{n=1}^{N} \mathbf{v}_n \cdot \underline{\mathbf{p}}_n - L = \frac{1}{2}\sum_{n=1}^{N} m_n v_n^2 + \sum_{n=1}^{N} q_n^{(ch)}\,\Phi(\mathbf{r}_n,t) \tag{2.105}$$

Note that, even though we are in the s-system, the generalized energy function here
is not equal to $T + U^{(vel)}$ since the terms linear in the velocity have canceled. However,
the generalized energy eqn (2.105) is equal to the total energy of the system of
charges as it is usually defined in electrodynamics.
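The statements about the generalized momentum and the generalized energy can be tried out on a simple case. The following is a minimal sketch, not part of the original text: a single charge moves in the x-y plane in a uniform field $\mathbf{B} = b\,\hat{\mathbf{e}}_3$, described by the assumed potentials $\Phi = 0$ and $\mathbf{A} = (0, bx, 0)$, for which $y$ does not appear in the Lagrangian. The Lorentz force equation is integrated with a hand-written Runge-Kutta step, and the y-component of $m\mathbf{v} + (q/c)\mathbf{A}$ and the generalized energy $\tfrac{1}{2}mv^2$ are monitored. All parameter values and function names are illustrative assumptions.

```python
def derivative(state, w):
    """Time derivative of (x, y, vx, vy) for m dv/dt = (q/c) v x B, B = b e_3.

    w = q b / (m c) is the only combination of constants that enters.
    """
    x, y, vx, vy = state
    return (vx, vy, w * vy, -w * vx)

def rk4_step(state, dt, w):
    """One classical Runge-Kutta step (an assumed integrator, not from the text)."""
    def shifted(slope, h):
        return tuple(s + h * d for s, d in zip(state, slope))
    k1 = derivative(state, w)
    k2 = derivative(shifted(k1, dt / 2), w)
    k3 = derivative(shifted(k2, dt / 2), w)
    k4 = derivative(shifted(k3, dt), w)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def conserved_quantities(state, m, q, c, b):
    """Return (generalized momentum along y, generalized energy).

    Uses the assumed potentials Phi = 0 and A = (0, b x, 0).
    """
    x, y, vx, vy = state
    p_y = m * vy + (q / c) * b * x
    h = 0.5 * m * (vx**2 + vy**2)
    return p_y, h

def max_drift(state, m=1.0, q=1.0, c=1.0, b=1.0, dt=1e-3, steps=5000):
    """Largest change in both quantities seen along the trajectory."""
    w = q * b / (m * c)
    p0, h0 = conserved_quantities(state, m, q, c, b)
    dp = dh = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt, w)
        p, h = conserved_quantities(state, m, q, c, b)
        dp = max(dp, abs(p - p0))
        dh = max(dh, abs(h - h0))
    return dp, dh

if __name__ == "__main__":
    # Hypothetical initial position and velocity.
    print(max_drift((0.3, -0.2, 0.4, 0.1)))
```

In this sketch the particle momentum $m\dot y$ alone is not constant, since the charge moves on a circle, but the combination $m\dot y + (q/c)\,b\,x$ stays fixed to within the integration error.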
It seems surprising that a complicated velocity-dependent force like the Lorentz
force of electrodynamics can be written in the Lagrangian form of eqn (2.98). Why
do electrodynamics and Lagrangian mechanics fit together so neatly? We leave that
question for the reader to ponder.

Other velocity-dependent potentials are possible. The general rule for their use
follows the same pattern as the electromagnetic example. In the s-system with velocity
dependent potential $U^{(vel)} = U^{(vel)}(s,\dot s,t)$, the generalized forces can be defined as

$$F_i = \frac{d}{dt}\left(\frac{\partial U^{(vel)}(s,\dot s,t)}{\partial \dot s_i}\right) - \frac{\partial U^{(vel)}(s,\dot s,t)}{\partial s_i} \tag{2.106}$$

and the Lagrange equations are

$$\frac{d}{dt}\left(\frac{\partial L(s,\dot s,t)}{\partial \dot s_i}\right) - \frac{\partial L(s,\dot s,t)}{\partial s_i} = 0 \quad\text{where}\quad L(s,\dot s,t) = T(\dot s) - U^{(vel)}(s,\dot s,t) \tag{2.107}$$

Using the general q-system, the velocity-dependent potential will be $U^{(vel)}(q,\dot q,t)$,
obtained as a compound function from $U^{(vel)}(s,\dot s,t)$,

$$U^{(vel)} = U^{(vel)}(q,\dot q,t) = U^{(vel)}\left(s(q,t), \dot s(q,\dot q,t), t\right) \tag{2.108}$$

The generalized forces are

$$Q_k = \frac{d}{dt}\left(\frac{\partial U^{(vel)}(q,\dot q,t)}{\partial \dot q_k}\right) - \frac{\partial U^{(vel)}(q,\dot q,t)}{\partial q_k} \tag{2.109}$$

and the Lagrange equations are

$$\frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\right) - \frac{\partial L(q,\dot q,t)}{\partial q_k} = 0 \quad\text{where}\quad L(q,\dot q,t) = T(q,\dot q,t) - U^{(vel)}(q,\dot q,t) \tag{2.110}$$

The zero on the right in eqns (2.107, 2.110) follows from the assumption that no
forces other than those produced by $U^{(vel)}$ are present.



2.18 Exercises 

Exercise 2.1 

(a) Calculate the Jacobian matrix for $s_1 = x$, $s_2 = y$, $s_3 = z$ and $q_1 = r$, $q_2 = \theta$, and $q_3 = \phi$,
the transformation from Cartesian to spherical-polar coordinates. Show that $r, \theta, \phi$ are good
generalized coordinates except on the z-axis.

(b) Work out in detail the derivation of eqn (2.66) from eqn (2.23).

Exercise 2.2

(a) Calculate the Jacobian matrix for $s_1 = x$, $s_2 = y$, $s_3 = z$ and $q_1 = \rho$, $q_2 = \phi$, and
$q_3 = z$, the transformation from Cartesian to cylindrical-polar coordinates. Show that $\rho, \phi, z$
are good generalized coordinates except on the z-axis.

(b) Starting from the Lagrangian in eqn (2.23), work out in detail the transformation to the
Lagrangian $L(\rho, \phi, z, \dot\rho, \dot\phi, \dot z, t)$ for this system.
Exercise 2.3 We are given a Lagrangian $L(q,\dot q,t)$. Assume that there are no non-potential
forces. Let $f(q,t)$ be an arbitrary function of $q = q_1, q_2, \ldots, q_N$ and possibly the time $t$.
Show that if $q_k = q_k(t)$ are a solution of

$$\frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\right) - \frac{\partial L(q,\dot q,t)}{\partial q_k} = 0 \tag{2.111}$$

then the same $q_k(t)$ are also a solution of

$$\frac{d}{dt}\left(\frac{\partial L'(q,\dot q,t)}{\partial \dot q_k}\right) - \frac{\partial L'(q,\dot q,t)}{\partial q_k} = 0 \tag{2.112}$$

where

$$L'(q,\dot q,t) = L(q,\dot q,t) + \frac{d f(q,t)}{dt} \tag{2.113}$$

This problem shows that $L$ and $L'$ are equivalent Lagrangians. The same solution will be
found no matter which one is used.



Exercise 2.4 Consider a collection that consists of just two masses, $m_1$ and $m_2$. We can
define the center of mass $\mathbf{R}$ and the vectors $\boldsymbol{\rho}_1$ and $\boldsymbol{\rho}_2$ as in Section 1.9. However, the
components of these three vectors are not suitable generalized coordinates. For one thing, there are
nine of them, whereas the number of degrees of freedom $D$ is only six (the six components
of $\mathbf{r}_1$ and $\mathbf{r}_2$). Suppose that we define a new vector $\mathbf{r}$ by $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$ and define $\mathbf{v} = d\mathbf{r}/dt$ as
its time derivative. Also it will be useful to define a reduced mass $\mu = m_1 m_2/(m_1 + m_2)$.

(a) Write $\mathbf{r}_1$ and $\mathbf{r}_2$ in terms of $\mathbf{R}$ and $\mathbf{r}$ and the appropriate masses. Then show that the six
components of $\mathbf{R}$ and $\mathbf{r}$ satisfy the Jacobian determinant condition and so are good
generalized coordinates.

(b) Write $\mathbf{P}$, $\mathbf{L}$, $\mathbf{S}$, $T_o$, and $T_I$ in terms of $\mu$, $M$, $\mathbf{R}$, $\mathbf{r}$, $\mathbf{V}$, and $\mathbf{v}$ only.

Exercise 2.5 Suppose that the two masses in Exercise 2.4 have a motion defined by a
Lagrangian function

$$L = \frac{1}{2}m_1 v_1^2 + \frac{1}{2}m_2 v_2^2 - U(\mathbf{r}_2 - \mathbf{r}_1) \tag{2.114}$$

where $v_1 = \sqrt{\mathbf{v}_1 \cdot \mathbf{v}_1}$ and $\mathbf{v}_1 = d\mathbf{r}_1/dt$, with similar definitions for the second mass.

(a) Rewrite the Lagrangian in terms of the variables $\mathbf{r}$, $\mathbf{R}$ and their derivatives. Show that this
Lagrangian can be written as the sum of two terms, one of which depends only on $\mathbf{R}$ and its
time derivative and the other only on $\mathbf{r}$ and its time derivative. (Such Lagrangian systems are
called separable.)

(b) Show that the three components of $\mathbf{R}$ are ignorable coordinates, and that the total
momentum of the system is conserved.

Exercise 2.6 A mass $m$ is acted on by a force derived from the generalized potential

$$U^{(vel)}(\mathbf{r},\mathbf{v},t) = U(r) + \boldsymbol{\sigma} \cdot \mathbf{L} \tag{2.115}$$

where

$$r = \sqrt{x^2 + y^2 + z^2} \qquad \boldsymbol{\sigma} = \sigma\,\hat{\mathbf{e}}_3 \qquad \mathbf{L} = \mathbf{r} \times m\mathbf{v} \tag{2.116}$$

and $\mathbf{r}$ and $\mathbf{v}$ are the position and velocity of the mass relative to some inertial coordinate
system.

(a) Express $U^{(vel)}$ in Cartesian coordinates (the s-system) $x, y, z, \dot x, \dot y, \dot z$ and find the force $\mathbf{F}$
(i.e. find the three components $F_x, F_y, F_z$).

(b) Now express $U^{(vel)}$ in spherical polar coordinates (which we might call the q-system)
$r, \theta, \phi, \dot r, \dot\theta, \dot\phi$ and find the generalized force $Q_k$ for $k = 1, 2, 3$.

(c) The force vector we found in part (a) can be re-expressed in terms of the spherical polar
unit vectors as

$$\mathbf{F} = F_r\,\hat{\mathbf{r}} + F_\theta\,\hat{\boldsymbol{\theta}} + F_\phi\,\hat{\boldsymbol{\phi}} \tag{2.117}$$

where

$$F_r = \hat{\mathbf{r}} \cdot \mathbf{F} \qquad F_\theta = \hat{\boldsymbol{\theta}} \cdot \mathbf{F} \qquad F_\phi = \hat{\boldsymbol{\phi}} \cdot \mathbf{F} \tag{2.118}$$

Show that $Q_r$ is equal to $F_r$.

(d) However, $Q_\phi$ is not equal to $F_\phi$. Show that $Q_\phi$ is equal to the z-component of the torque
$\boldsymbol{\tau} = \mathbf{r} \times \mathbf{F}$.

(e) Verify that the units of $Q_\phi\,\delta\phi$ do obey the rule described in Section 2.14.

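Exercise 2.7 Given an electric potential $\Phi(\mathbf{r},t)$ and a vector potential $\mathbf{A}(\mathbf{r},t)$, the electric
and magnetic induction fields can be expressed as

$$\mathbf{E} = -\nabla\Phi - \frac{1}{c}\,\frac{\partial \mathbf{A}}{\partial t} \qquad \mathbf{B} = \nabla \times \mathbf{A} \tag{2.119}$$

We know that the $\mathbf{E}$ and $\mathbf{B}$ fields are left invariant by a gauge transformation of the potentials.
That is, if

$$\mathbf{A}' = \mathbf{A} + \nabla\chi \qquad \Phi' = \Phi - \frac{1}{c}\,\frac{\partial \chi}{\partial t} \tag{2.120}$$

and $\mathbf{E}'$ and $\mathbf{B}'$ are found using eqn (2.119) but with $\Phi$ and $\mathbf{A}$ replaced by the primed potentials
$\Phi'$ and $\mathbf{A}'$, then it can be shown that $\mathbf{E}' = \mathbf{E}$ and $\mathbf{B}' = \mathbf{B}$.

(a) For the case of a single particle of mass $m$ and charge $q^{(ch)}$, use eqn (2.103) to write out
two Lagrangians, one using the original potentials and one using the primed potentials. Call
them $L$ and $L'$.

(b) Find

$$\underline{\mathbf{p}} = \frac{\partial L(\mathbf{r},\mathbf{v},t)}{\partial \mathbf{v}} \quad\text{and}\quad \underline{\mathbf{p}}' = \frac{\partial L'(\mathbf{r},\mathbf{v},t)}{\partial \mathbf{v}} \tag{2.121}$$

(c) Show that

$$L' = L + \frac{d f(\mathbf{r},t)}{dt} \tag{2.122}$$

and write $f(\mathbf{r},t)$ in terms of $\chi(\mathbf{r},t)$.

(d) If $\mathbf{r} = \mathbf{r}(t)$ is a solution to the Lagrange equations with $L$, is it also a solution to the
Lagrange equations with $L'$? Should it be? If it is, show why it is, and if not show why it is
not.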



Exercise 2.8 A single particle of mass $m$ in one dimension has the Lagrangian in some q
system of coordinates



L(qi,qi,t) 




2 2 
m co a 




where $a$ and $\omega$ are given constants having appropriate units.



(2.123) 
(a) Find the generalized momentum $p_1$ and the generalized energy function $H_q(q,\dot q,t)$ for
the q system. Is the generalized energy conserved?

(b) Suppose that the q system coordinates are related to those of the s system by $q_1 = a/s_1$.
Write the Lagrangian in the s-system, $L(s,\dot s,t)$.

(c) Find the generalized momentum $P_1$ and the generalized energy function $H_s(s,\dot s,t)$ for
the s-system. Is the generalized energy conserved?

(d) Show that the momenta $p_1$ and $P_1$ are related as predicted by eqn (2.71).

(e) When expressed in the same coordinate system, is $H_s$ equal to $H_q$? Why should it be?

Fig. 2.1. Illustration for Exercise 2.9.



Exercise 2.9 A horizontal, circular table with a frictionless top surface is constrained to
rotate about a vertical line through its center, with constant angular velocity $\omega_0$. A peg is driven
into the table top at a distance $a$ from the center of the circle. A mass $m$ slides freely on the
top surface of the table, connected to the peg by a massless spring of force constant $k$ and zero
rest length. Take the s-system to be an inertial system of Cartesian coordinates $x, y$ with
origin at the center of the table top, and the q-system to be rotating Cartesian coordinates $x', y'$
defined so that $\hat{\mathbf{e}}'_1$ defines a line passing through the peg. Ignore the z-coordinate, and treat
this problem as one with two degrees of freedom. The transformation between coordinates of
the mass in the two systems is

$$x = x'\cos\omega_0 t - y'\sin\omega_0 t \qquad y = x'\sin\omega_0 t + y'\cos\omega_0 t \tag{2.124}$$

(a) Write $L(s,\dot s,t)$ in the s-system and $L(q,\dot q,t)$ in the q-system.

(b) Write $H_s$ in the s-system. Is it equal to $T + U$? Is it conserved?

(c) Write $H_q$ in the q-system. Is it equal to $T + U$? Is it conserved?

Exercise 2.10 A one-dimensional system has the Lagrangian

$$L(q,\dot q,t) = \frac{m}{2}\left(\dot q_1^2 \sin^2\omega t + \omega^2 q_1^2 \cos^2\omega t + \omega q_1 \dot q_1 \sin 2\omega t\right) - m g q_1 \sin\omega t \tag{2.125}$$

where $0 < t < \pi/\omega$.

(a) Find the generalized energy function for the q-system, $H_q(q,\dot q,t)$. Is it conserved?

(b) Make a change of generalized coordinates, with the new coordinate $r_1$ defined by
$q_1 = r_1/\sin\omega t$, as in Section 2.10. Write the Lagrangian in the r-system, $L(r,\dot r,t)$.

(c) Find the generalized energy function for the r-system, $H_r(r,\dot r,t)$. Is it conserved?
Fig. 2.2. Illustration for Exercise 2.11.

Exercise 2.11 Consider a plane double pendulum with rigid, massless, but possibly
extensible sticks. It has a mass $m_1$ at coordinates $x_1, y_1$ and a mass $m_2$ at $x_2, y_2$. Gravity $\mathbf{g} = g\,\hat{\mathbf{e}}_1$
acts downwards. Ignore the z coordinate in this problem, and assume that all pivots are
frictionless. In the s-system $s = x_1, y_1, x_2, y_2$, the Lagrangian is

$$L(s,\dot s,t) = \frac{m_1}{2}\left(\dot x_1^2 + \dot y_1^2\right) + \frac{m_2}{2}\left(\dot x_2^2 + \dot y_2^2\right) + m_1 g x_1 + m_2 g x_2 \tag{2.126}$$

(a) Consider a change of generalized coordinates to the q-system $q = r_1, \theta_1, r_2, \theta_2$ shown
in the diagram. Write the four transformation equations of the form $s_i = s_i(q,t)$ for
$i = 1, \ldots, 4$.

(b) Calculate the Jacobian determinant $|\partial s/\partial q|$ for this transformation and find the conditions
under which the q-system are good generalized coordinates.

(c) Write the Lagrangian $L(q,\dot q,t)$ in the q-system.

Exercise 2.12 A point particle of mass $m$ and charge $q$ moves near a very long wire carrying
a current $I$. Choose the $\hat{\mathbf{e}}_3$ axis along the wire in the direction of the current. In the region
near the wire, the vector potential in terms of cylindrical polar coordinates $\rho, \phi, z$ is

A = --- lnf- P -^ z (2.127) 

27 rc \PoJ 

where $\rho_0$ is some arbitrarily chosen $\rho$ value. Assume the electric potential $\Phi$ to be zero.

(a) Write the Lagrangian $L = L(\rho, \phi, z, \dot\rho, \dot\phi, \dot z, t)$ for the particle, using cylindrical polar
coordinates.

(b) Find the generalized momenta $p_\rho$, $p_\phi$, and $p_z$.

(c) Write the three Lagrange equations, and show that $\phi$ and $z$ are ignorable coordinates.

(d) Use the $\phi$ and $z$ Lagrange equations to write expressions for $\dot\phi$ and $\dot z$ as functions of $\rho$ and
integration constants.

(e) Write the generalized energy function. Is it conserved? Use it to express $\dot\rho^2$ as a function
only of $\rho$ and some constants that can be determined at $t = 0$.
One attractive feature of the Lagrangian method is the ease with which it solves so-called
constraint problems. But, as the reader will see, applying the correct method
for a particular problem can be something of an art. We present several different 
ways of solving such problems, with examples of each. With experience, the reader 
will become adept at choosing among them. 

In the previous chapter, the generalized coordinates were assumed to be indepen- 
dent variables. But there are problems of interest in which these coordinates are not 
independent, but rather are forced into particular relations by what are called con- 
straints. For example, the x,y,z coordinates of a point mass falling under gravity are 
independent. But if the mass is forced to slide on the surface of a plane, there would 
be a constraint in a form such as $\alpha x + \beta y + \gamma z - A = 0$ tying them together. The
present chapter shows that such constraints can be incorporated into the Lagrangian 
method in a particularly convenient way. If the constraints are idealized (such as fric- 
tionless surfaces or perfectly rigid bodies), then the equations of motion can be solved 
without knowing the forces of constraint. Also, the number of degrees of freedom of 
the Lagrangian system can be reduced by one for each constraint applied. 



Fig. 3.1. Example of a holonomic constraint. The mass $m$ is constrained to move on the surface
of a plane defined by $\hat{\mathbf{n}} \cdot \mathbf{r} = A$. Constants $\alpha, \beta, \gamma$ are the components of a unit vector $\hat{\mathbf{n}}$
perpendicular to the plane, and $A$ is the perpendicular distance from the plane to the
origin.

3.1 Constraints Defined 

The simplest class of constraints are those called holonomic. A constraint is holonomic 
if it can be represented by a single function of the generalized coordinates, equated 
to zero, as in 

$$G_a(q, t) = 0 \qquad (3.1)$$

for $a = 1, \ldots, C$, where each $a$ value is considered a separate constraint.

The relation among the coordinates may vary with time. For example, if the plane
in the previous paragraph had a time-varying distance from the origin $A(t)$, the constraint,
using the s-system, would be

$$G_1(s, t) = \alpha x + \beta y + \gamma z - A(t) = 0 \qquad (3.2)$$

If the number of constraints $C$ is greater than one, care must be taken to ensure
that the constraint equations are functionally independent. Otherwise the actual
number of constraints may be fewer than the number listed. As discussed in Theorem
D.28.1, the condition for functional independence of the constraints is that the $C \times D$
matrix whose elements are

$$\left(\frac{\partial G}{\partial q}\right)_{ak} = \frac{\partial G_a(q, t)}{\partial q_k} \qquad (3.3)$$

must have rank $C$. In other words, there must be a nonzero $C \times C$ determinant, called
a critical minor,$^{14}$ constructed by selecting the $C$ rows, and $C$ of the $D$ columns, of
eqn (3.3). We will assume throughout that all sets of constraints obey this condition.
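The rank condition can be checked mechanically for a given set of constraints. The following is a minimal sketch, not a method taken from the text: the helper `matrix_rank` and the two sample sets of plane constraints are hypothetical, chosen only for illustration. Because the sample constraints are linear, the matrix of eqn (3.3) is constant; in general it depends on $q$ and $t$ and would have to be evaluated at each point of interest.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix by Gaussian elimination in exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below 'rank' with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Rows are (dG_a/dx, dG_a/dy, dG_a/dz) in the s-system, so C = 2 and D = 3.
independent = [[1, 0, 0],   # G_1 = x - 1
               [0, 1, 0]]   # G_2 = y - 2
dependent = [[1, 2, 3],     # G_1 = x + 2y + 3z - 1
             [2, 4, 6]]     # G_2 = 2x + 4y + 6z - 2 (the same plane listed twice)

print(matrix_rank(independent))  # 2: rank equals C, constraints are independent
print(matrix_rank(dependent))    # 1: rank is less than C, only one real constraint
```

Exact rational arithmetic is used here so that a rank deficiency is not hidden by rounding; for constraints with floating-point coefficients a tolerance-based test would be needed instead.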



3.2 Virtual Displacement 

In the treatment of Lagrangian constraint problems, it is very convenient to define the 
new concept of virtual displacement. 

Definition 3.2.1: Virtual Displacements Defined 

A virtual displacement of a function $f = f(q, t)$ is its differential, but with the convention
that the time $t$ is held fixed so that $\delta t = 0$. These virtual displacements are denoted
with a lowercase Greek $\delta$ to distinguish them from normal differentials.

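The virtual displacement of a function $f(q, t)$ is then,

$$\delta f = \sum_{k=1}^{D} \frac{\partial f(q, t)}{\partial q_k}\,\delta q_k \qquad (3.4)$$

and the virtual displacement of the constraint function $G_a(q, t)$ defined in Section 3.1
is

$$\delta G_a = \sum_{k=1}^{D} \frac{\partial G_a(q, t)}{\partial q_k}\,\delta q_k \qquad (3.5)$$

These definitions also apply to the s-system. By the chain rule, virtual displacements
in the s- and q-systems are related by

$$\delta s_i = \sum_{k=1}^{D} \frac{\partial s_i(q, t)}{\partial q_k}\,\delta q_k \quad\text{and the inverse relation}\quad \delta q_k = \sum_{i=1}^{D} \frac{\partial q_k(s, t)}{\partial s_i}\,\delta s_i \qquad (3.6)$$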

14 See Section B.17 of Appendix B for a discussion of the rank of a matrix. 
and the differential of $G_a$ can be written equivalently as

$$\delta G_a = \sum_{i=1}^{D} \frac{\partial G_a(s, t)}{\partial s_i}\,\delta s_i \qquad (3.7)$$



Definition 3.2.2: Virtual Displacements Re-defined 

The definition of virtual displacement in Definition 3.2.1 is now extended to include
the condition that the $\delta q_k$ must be chosen so that, at each instant of time, and for all
$a = 1, \ldots, C$,

$$\delta G_a = 0 \qquad (3.8)$$

Virtual displacements at a frozen instant of time must be such that the constraints
are maintained, so that both $G_a(q, t) = 0$ and $G_a(q + \delta q, t) = 0$. For example, if the only
constraint is that a single mass must move on a flat, horizontal elevator floor located
at $z = h(t)$, then the constraint equation is $G_1(s, t) = z - h(t) = 0$ and the only
allowed nonzero virtual displacements are $\delta x$ and $\delta y$. The constraint requires $\delta z$ to
equal zero. The virtual displacements are constrained to remain in the instantaneous
surface of constraint as it is at time $t$, even though that surface may be moving as $t$
evolves.



Fig. 3.2. A mass is constrained to slide without friction on the floor of an elevator which
is moving upwards. The constraint is $z = h(t)$. The virtual displacement $\delta\mathbf{r}$ is parallel
to the instantaneous position of the floor, even though the floor is moving. Thus $\delta z = 0$
as shown. Since the floor is frictionless, $\mathbf{F}^{(\mathrm{cons})}$ is perpendicular to the floor and hence
$\delta W^{(\mathrm{cons})} = \mathbf{F}^{(\mathrm{cons})} \cdot \delta\mathbf{r} = 0$.



3.3 Virtual Work 

Generally, constraints are maintained by the actions of forces, like the force exerted
on the mass by the elevator floor in the previous example. We will denote these forces
of constraint by $Q_k^{(\mathrm{cons})}$ in the q-system or $F_i^{(\mathrm{cons})}$ in the s-system. These forces of
constraint (and indeed any generalized forces) in the two systems are related by the
same transformation formulas as in Section 2.6,

$$Q_k^{(\mathrm{cons})} = \sum_{i=1}^{D} F_i^{(\mathrm{cons})}\,\frac{\partial s_i(q, t)}{\partial q_k} \quad\text{and}\quad F_i^{(\mathrm{cons})} = \sum_{k=1}^{D} Q_k^{(\mathrm{cons})}\,\frac{\partial q_k(s, t)}{\partial s_i} \qquad (3.9)$$
The virtual work of the forces of constraint is defined as

$$\delta W^{(\mathrm{cons})} = \sum_{i=1}^{D} F_i^{(\mathrm{cons})}\,\delta s_i = \sum_{k=1}^{D} Q_k^{(\mathrm{cons})}\,\delta q_k \qquad (3.10)$$

where the second equality follows from eqns (3.6, 3.9).$^{15}$

The problems that can be dealt with easily by Lagrangian theory of constraints are 
those in which, at least as an idealization, the forces of constraint do no virtual work. 
This portentous phrase means simply that 

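$$\delta W^{(\mathrm{cons})} = 0 \qquad (3.11)$$

for all allowed virtual displacements. For example, if the elevator floor in the previous
section is made of frictionless ice, then the only constraint force will be a normal force
$\mathbf{F}^{(\mathrm{cons})} = F_3^{(\mathrm{cons})}\hat{\mathbf{e}}_3$. Then, since $\delta z = 0$, the virtual displacement $\delta\mathbf{r} = \delta x\,\hat{\mathbf{e}}_1 + \delta y\,\hat{\mathbf{e}}_2$ will
be perpendicular to the constraint force, leading at once to the conclusion that

$$\delta W^{(\mathrm{cons})} = \sum_{i=1}^{D} F_i^{(\mathrm{cons})}\,\delta s_i = \mathbf{F}^{(\mathrm{cons})} \cdot \delta\mathbf{r} = 0 \qquad (3.12)$$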

A wide class of problems can be imagined in which masses slide without friction 
on various surfaces. For example, a coin sliding inside a spherical fish bowl made of 
frictionless ice, with the q-system taken to be spherical polar coordinates, would have 
$\delta r = 0$ and $Q_\theta^{(\mathrm{cons})} = Q_\phi^{(\mathrm{cons})} = 0$, leading again to $\delta W^{(\mathrm{cons})} = 0$. A bead sliding on a
frictionless wire of arbitrary shape would have a force of constraint perpendicular to 
the wire but virtual displacement only along it, again producing zero virtual work. 

A less obvious example is that the cohesive forces binding the masses of an ideal- 
ized, perfectly rigid body also do no virtual work. The proof of this statement must be 
deferred until the motion of rigid bodies is treated in later chapters. (It is proved in 
Theorem 8.13.1.) Systems of rigid rods linked by frictionless pivots and joints, such as 
the single or multiple pendulum, also have constraint forces that do no virtual work. 
Another important example is that the friction force acting when a wheel rolls with- 
out slipping on some surface does no virtual work, since the force acts at the contact 
point, which does not move in virtual displacements. 

Virtual work is not the same as the real, physical work that may be done by the 
constraint forces. In the example of the elevator floor in Figure 3.2, if the elevator 
is moving upwards then the floor definitely will do real work on the mass as time 
evolves. But it will not do virtual work. The rule is that when the constraints are not 
time dependent, then the forces of constraint that do no virtual work will also not 
do real work. But when the constraints are time varying, with $\partial G_a(s, t)/\partial t \neq 0$, then
forces of constraint that do no virtual work may still do real work. 
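
To make the distinction concrete, here is a short worked sketch for the elevator floor of Figure 3.2. It assumes, beyond what the figure states, that gravity acts as $-g\hat{\mathbf{e}}_3$ and that the only constraint force is the normal force $F_3^{(\mathrm{cons})}\hat{\mathbf{e}}_3$. Newton's second law is used for the $z$ motion.

```latex
\begin{align*}
m\ddot{z} &= F_3^{(\mathrm{cons})} - mg, \qquad z = h(t)
  \;\Rightarrow\; F_3^{(\mathrm{cons})} = m\left(\ddot{h} + g\right) \\
\delta W^{(\mathrm{cons})} &= F_3^{(\mathrm{cons})}\,\delta z = 0
  \qquad \text{(virtual work, since } \delta z = 0 \text{ at frozen } t\text{)} \\
\frac{dW^{(\mathrm{real})}}{dt} &= F_3^{(\mathrm{cons})}\,\dot{z}
  = m\left(\ddot{h} + g\right)\dot{h}
  \qquad \text{(real power, nonzero whenever } \dot{h} \neq 0\text{)}
\end{align*}
```

The same force thus does no virtual work but does real work at a rate proportional to the floor velocity $\dot{h}$, which vanishes only if the constraint is time independent.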

15 The use of the symbol $\delta W^{(\mathrm{cons})}$ in eqn (3.10) is not meant to imply the existence of a work function
$W^{(\mathrm{cons})}(q, t)$ from which the generalized forces of constraint can be derived by partial differentiation.
Such a function exists only in trivial cases. The forces of constraint take whatever values are necessary to 
maintain the constraints. In general, this means that they depend on the first and second time derivatives 
of the generalized coordinates, as well as the generalized coordinates and the time. 
3.4 Form of the Forces of Constraint

The reason for the above definitions of virtual displacement and virtual work is to 
allow us to state the following theorem. 

Theorem 3.4.1: Form of the Forces of Constraint 

Given a system of constraints defined as in eqn (3.1), and virtual displacements obeying
eqn (3.8), the virtual work of the forces of constraint vanishes,

$$\delta W^{(\mathrm{cons})} = 0 \qquad (3.13)$$

for all allowed virtual displacements if and only if there exist $\lambda_a$ factors, called Lagrange
multipliers, such that the constraint forces can be written in the following form

$$Q_k^{(\mathrm{cons})} = \sum_{a=1}^{C} \lambda_a\,\frac{\partial G_a(q, t)}{\partial q_k} \quad\text{or, equivalently,}\quad F_i^{(\mathrm{cons})} = \sum_{a=1}^{C} \lambda_a\,\frac{\partial G_a(s, t)}{\partial s_i} \qquad (3.14)$$

where the $\lambda_a$ factors in the first of eqn (3.14) are the same as those in the second.



Proof: The equivalence of the two equations in eqn (3.14) follows from eqn (3.9) 
and the chain rule. 

We prove the theorem in the general q-system. The proof in the s-system is similar. 
First, we prove that eqn (3.14) implies eqn (3.13). The definition eqn (3.10) gives 

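$$\delta W^{(\mathrm{cons})} = \sum_{k=1}^{D} Q_k^{(\mathrm{cons})}\,\delta q_k = \sum_{k=1}^{D}\sum_{a=1}^{C} \lambda_a\,\frac{\partial G_a(q, t)}{\partial q_k}\,\delta q_k = \sum_{a=1}^{C} \lambda_a \sum_{k=1}^{D} \frac{\partial G_a(q, t)}{\partial q_k}\,\delta q_k = \sum_{a=1}^{C} \lambda_a\,\delta G_a \qquad (3.15)$$

where eqn (3.14) was used. But, by the definition of virtual displacement in Definition
3.2.2, $\delta G_a = 0$ for all $a$. Hence $\delta W^{(\mathrm{cons})} = 0$, as was to be proved.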

The converse proof, that $\delta W^{(\mathrm{cons})} = 0$ implies eqn (3.14), is a bit more involved.
As a preliminary to the proof, note that the matrix eqn (3.3) discussed in Section
3.1 is assumed to have rank $C$. Since the order in which generalized coordinates are
indexed is arbitrary, we may gain some clarity without loss of generality by assuming
that its critical minor consists of its $C$ rows and its last $C$ columns, from $(D - C + 1)$
to $D$. We then denote$^{16}$ by $q^{(f)}$ the set of free variables $q_1, \ldots, q_{(D-C)}$ and by $q^{(b)}$
the set of bound variables $q_{(D-C+1)}, \ldots, q_D$. The constraint conditions eqns (3.5, 3.8)



16 The choice of bound and free variables here is not unique. In a given problem, there may be several 
critical minors of eqn (3.3) and hence several ways in which the free-bound division can be made. 
imply

$$0 = \delta G_a = \sum_{k=1}^{D} g_{ak}\,\delta q_k = \sum_{k=1}^{D-C} g_{ak}\,\delta q_k^{(f)} + \sum_{l=D-C+1}^{D} g_{al}\,\delta q_l^{(b)} \qquad (3.16)$$

for $a = 1, \ldots, C$, where we have introduced the notation

$$g_{ak} = \frac{\partial G_a(q, t)}{\partial q_k} \qquad (3.17)$$

and have separated the sums over the free (superscript $(f)$, index $k$) and bound
(superscript $(b)$, index $l$) variables in the last form of the expression.

The $C \times C$ matrix $g^{(b)}$ whose $ai$th matrix element is defined to be

$$g^{(b)}_{ai} = g_{a(D-C+i)} \qquad (3.18)$$

is nonsingular by the assumption that the last $C$ columns of eqn (3.3) are a critical
minor of that matrix. Therefore, the inverse $g^{(b)-1}$ exists and may be used to solve
eqn (3.16) for the bound virtual displacements in terms of the free ones. Thus, for
$i = 1, \ldots, C$,

$$\delta q^{(b)}_{D-C+i} = -\sum_{a=1}^{C}\sum_{k=1}^{D-C} g^{(b)-1}_{ia}\, g_{ak}\,\delta q_k^{(f)} \qquad (3.19)$$

With this relation now assumed, eqn (3.16) becomes an identity, satisfied regardless
of the values we choose for the $\delta q_k^{(f)}$ displacements. Thus the $\delta q_k^{(f)}$ are not bound by
the constraints and may be assigned any values, just as the name “free” suggests.

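Now form an expression by multiplying eqn (3.8) by an unknown function $\lambda_a$ and
subtracting the sum over $a$ from eqn (3.11). Since each constituent of this expression
is zero by assumption, the expression also vanishes. Thus

$$0 = \delta W^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a\,\delta G_a = \sum_{k=1}^{D}\left(Q_k^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a g_{ak}\right)\delta q_k$$
$$= \sum_{k=1}^{D-C}\left(Q_k^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a g_{ak}\right)\delta q_k^{(f)} + \sum_{l=D-C+1}^{D}\left(Q_l^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a g_{al}\right)\delta q_l^{(b)} \qquad (3.20)$$

where we have once again separated the sums over free and bound variables. The last
sum in eqn (3.20) may be written

$$\sum_{l=D-C+1}^{D}\left(Q_l^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a g_{al}\right)\delta q_l^{(b)} = \sum_{i=1}^{C}\left(Q_{D-C+i}^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a g^{(b)}_{ai}\right)\delta q^{(b)}_{D-C+i} \qquad (3.21)$$

The $\lambda_a$ can now be chosen to be

$$\lambda_a = \sum_{i=1}^{C} Q_{D-C+i}^{(\mathrm{cons})}\, g^{(b)-1}_{ia} \qquad (3.22)$$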
which makes

$$Q_{D-C+i}^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a g^{(b)}_{ai} = 0 \quad\text{or equivalently}\quad Q_l^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a g_{al} = 0 \qquad (3.23)$$

identically for all $i = 1, \ldots, C$, or equivalently for all $l = D - C + 1, \ldots, D$.

Thus the choice in eqn (3.22) makes the last sum in eqn (3.20) zero. Equation
(3.20) then reduces to

$$0 = \sum_{k=1}^{D-C}\left(Q_k^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a g_{ak}\right)\delta q_k^{(f)} \qquad (3.24)$$

Now invoke the independence of the free displacements to set the $\delta q_k^{(f)}$ nonzero one
at a time, thus establishing that, for $k = 1, \ldots, D - C$,

$$Q_k^{(\mathrm{cons})} - \sum_{a=1}^{C} \lambda_a g_{ak} = 0 \qquad (3.25)$$

Together with the second of eqn (3.23) for $l = D - C + 1, \ldots, D$, this establishes eqn
(3.14) for all $k$ values, as was to be proved. □
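
Both directions of the theorem can be checked numerically in the simplest case of a single plane constraint, $C = 1$ and $D = 3$. The sketch below is illustrative only: the function names and the numerical values of the normal vector, the multiplier, and the free displacements are hypothetical. It takes $x, y$ as the free variables and $z$ as the bound one, which assumes $\gamma \neq 0$ so that the $1 \times 1$ critical minor is nonzero.

```python
def bound_displacement(normal, dx, dy):
    """Eqn (3.19) for the one constraint G = alpha*x + beta*y + gamma*z - A.

    x and y are the free variables and z is the bound one, which
    requires gamma != 0 (the 1 x 1 critical minor).
    """
    alpha, beta, gamma = normal
    return -(alpha * dx + beta * dy) / gamma

def virtual_work(force, displacement):
    """Eqn (3.10) in the s-system: sum of F_i * delta s_i."""
    return sum(f * d for f, d in zip(force, displacement))

normal = (0.6, 0.0, 0.8)                 # illustrative unit vector (alpha, beta, gamma)
lam = 2.5                                # illustrative Lagrange multiplier
force = tuple(lam * n for n in normal)   # eqn (3.14): F_i = lambda * dG/ds_i

dx, dy = 0.3, -0.7                       # arbitrary free virtual displacements
dz = bound_displacement(normal, dx, dy)  # bound displacement fixed by the constraint
work = virtual_work(force, (dx, dy, dz))

# Eqn (3.22) with C = 1: lambda = F_3 / gamma recovers the multiplier.
lam_recovered = force[2] / normal[2]
print(abs(work) < 1e-12, abs(lam_recovered - lam) < 1e-12)
```

A force of the form in eqn (3.14) does no virtual work for any choice of the free displacements, and the multiplier is recovered from the bound component alone, mirroring the two halves of the proof.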



3.5 General Lagrange Equations with Constraints 

There is a wide class of idealized systems in which it can be assumed that the only
forces acting are either constraint forces or forces derived from a potential function.
Such systems are sometimes called monogenic. For such systems, the only non-potential
forces appearing are the constraint forces. Thus $Q_k^{(\mathrm{NP})} = Q_k^{(\mathrm{cons})}$ and the
general Lagrange equations, eqn (2.52), become

$$\frac{d}{dt}\left(\frac{\partial L(q, \dot{q}, t)}{\partial \dot{q}_k}\right) - \frac{\partial L(q, \dot{q}, t)}{\partial q_k} = Q_k^{(\mathrm{cons})} \qquad (3.26)$$

For forces of constraint that do no virtual work, Theorem 3.4.1 then allows us to 
write the Lagrange equations for the constrained motion in a form that can be solved 
without knowing the forces of constraint in advance. This result is one of the triumphs 
of the Lagrangian method. 

Theorem 3.5.1: General Lagrange Equations with Constraints 

If the only non-potential forces in a problem are the forces of constraint and if those forces
of constraint do no virtual work, then the Lagrange equations become, for $k = 1, \ldots, D$,

$$\frac{d}{dt}\left(\frac{\partial L(q, \dot{q}, t)}{\partial \dot{q}_k}\right) - \frac{\partial L(q, \dot{q}, t)}{\partial q_k} = \sum_{a=1}^{C} \lambda_a\,\frac{\partial G_a(q, t)}{\partial q_k} \qquad (3.27)$$

Together with the set of constraint equations

$$G_a(q, t) = 0 \qquad (3.28)$$

for $a = 1, \ldots, C$, these are $(D + C)$ equations in the $(D + C)$ variables $q_1, \ldots, q_D$,
$\lambda_1, \ldots, \lambda_C$ and so may be solved for these variables.
Proof: Since the forces of constraint are assumed to do no virtual work, Theorem
3.4.1 applies and eqn (3.14) may be substituted into eqn (3.26) to give the desired
result. □

In applying these formulas, the partial derivatives in eqn (3.27) must be calculated 
first, and then the constraints eqn (3.28) applied to simplify the resulting differential 
equation. Note that applying the constraints before taking the partial derivatives in 
eqn (3.27) would in general lead to error. 
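
A minimal illustration of why the order matters, using a system chosen here for simplicity rather than taken from the text: a mass $m$ moving in a plane with polar coordinates $r, \theta$, no potential, and the single constraint $G_1 = r - a = 0$.

```latex
\begin{align*}
L &= \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right), \qquad G_1(q, t) = r - a \\
\text{Correct order:}\quad
  & m\ddot{r} - m r\dot{\theta}^2 = \lambda_1
  \;\xrightarrow{\; r = a,\ \dot{r} = \ddot{r} = 0 \;}\;
  \lambda_1 = -m a\dot{\theta}^2 \\
\text{Wrong order:}\quad
  & L\big|_{r=a} = \frac{m}{2}\,a^2\dot{\theta}^2
  \;\Rightarrow\; \frac{\partial L}{\partial r} = 0
  \;\Rightarrow\; \lambda_1 = 0 \quad \text{(incorrect)}
\end{align*}
```

Taking the partial derivatives first yields the expected inward centripetal force of magnitude $m a\dot{\theta}^2$, whereas applying the constraint first loses the $r$ dependence of the kinetic energy and with it the constraint force.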

A second triumph of the Lagrangian method is that, not only can the problem be 
solved without knowing the forces of constraint in advance, but also the same solution 
allows one to calculate what those constraint forces must have been. 

Corollary 3.5.2: Calculation of Constraint Forces 

After the problem is solved for the equations of motion $q_k = q_k(t)$ by use of Theorem
3.5.1, one can then calculate what the forces of constraint were.

Proof: The solution to eqns (3.27, 3.28) gives the Lagrange multipliers $\lambda_1, \ldots, \lambda_C$
as well as the coordinates $q_1, \ldots, q_D$. These $\lambda_a$ values can then be inserted into eqn
(3.14) to give the forces of constraint in the q- or s-systems. □

The general Lagrange equations, eqn (3.27), have been given in the q-system. But 
equations of exactly the same form are true in any system of generalized coordinates. 
Just replace the letter q by s or r for the s- or r-systems, respectively. 

3.6 An Alternate Notation for Holonomic Constraints 

Some texts write eqn (3.27) in an alternate notation that the reader should be aware
of.$^{17}$ They define a new Lagrangian $\tilde{L}$ that includes the constraint functions,

$$\tilde{L}(q, \dot{q}, t, \lambda) = L(q, \dot{q}, t) + \sum_{a=1}^{C} \lambda_a G_a(q, t) \qquad (3.29)$$

Then eqn (3.27) can be written in the same form as the Lagrange equations without
constraints. It becomes

$$\frac{d}{dt}\left(\frac{\partial \tilde{L}(q, \dot{q}, t, \lambda)}{\partial \dot{q}_k}\right) - \frac{\partial \tilde{L}(q, \dot{q}, t, \lambda)}{\partial q_k} = 0 \qquad (3.30)$$

Unfortunately, most of these texts do not include the $\lambda$ in the list of variables in
$\tilde{L}(q, \dot{q}, t, \lambda)$, which leads the reader to wonder how to take partial derivatives of the
$\lambda_a$. When encountering this notation, one should mentally add $\lambda$ to the list of variables
in $\tilde{L}(q, \dot{q}, t, \lambda)$ so that the $\lambda_a$ are held constant when partials with respect to $q_k$ and
$\dot{q}_k$ are taken.

17 A notation similar to this one is also often used in the general calculus of variations.
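3.7 Example of the General Method

Let us return to the simple example of Section 2.3 but now with the constraint

$$0 = G_1(s, t) = \alpha x + \beta y + \gamma z - A \qquad (3.31)$$

discussed in the introduction to the present chapter. Applying eqn (3.27) and using
the Lagrangian in eqn (2.23) gives

$$\frac{d}{dt}\left(\frac{\partial L(s, \dot{s}, t)}{\partial \dot{x}}\right) - \frac{\partial L(s, \dot{s}, t)}{\partial x} = \lambda_1\alpha \quad\text{or}\quad m\ddot{x} + kx = \lambda_1\alpha \qquad (3.32)$$

$$\frac{d}{dt}\left(\frac{\partial L(s, \dot{s}, t)}{\partial \dot{y}}\right) - \frac{\partial L(s, \dot{s}, t)}{\partial y} = \lambda_1\beta \quad\text{or}\quad m\ddot{y} + ky = \lambda_1\beta \qquad (3.33)$$

$$\frac{d}{dt}\left(\frac{\partial L(s, \dot{s}, t)}{\partial \dot{z}}\right) - \frac{\partial L(s, \dot{s}, t)}{\partial z} = \lambda_1\gamma \quad\text{or}\quad m\ddot{z} + kz = \lambda_1\gamma \qquad (3.34)$$

which, together with the constraint equation eqn (3.31), can be solved for the four
unknowns $x, y, z, \lambda_1$.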
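
Under the assumptions that $A$ is constant and that $(\alpha, \beta, \gamma)$ is a unit vector, as in Fig. 3.1, the multiplier can be found in closed form. The following sketch shows one possible route: multiply eqns (3.32), (3.33), and (3.34) by $\alpha$, $\beta$, and $\gamma$ respectively, add them, and then apply the constraint.

```latex
\begin{align*}
m\frac{d^2}{dt^2}\left(\alpha x + \beta y + \gamma z\right)
  + k\left(\alpha x + \beta y + \gamma z\right)
  &= \lambda_1\left(\alpha^2 + \beta^2 + \gamma^2\right) = \lambda_1 \\
\alpha x + \beta y + \gamma z = A \ \text{(constant)}
  \;&\Rightarrow\; \lambda_1 = kA \\
m\ddot{x} + k\left(x - A\alpha\right) &= 0
  \qquad \text{(and similarly for } y \text{ with } \beta,\ z \text{ with } \gamma\text{)}
\end{align*}
```

With these assumptions each coordinate oscillates harmonically about the foot of the perpendicular from the origin to the plane, and eqn (3.14) gives a force of constraint of magnitude $kA$ directed along the unit normal.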



3.8 Reduction of Degrees of Freedom 

One of the benefits of the Lagrangian method is that holonomic constraints that do 
no virtual work may be used to reduce the number of degrees of freedom (i.e., the 
number of generalized coordinates) from D to (D — C) where C is the number of 
independent constraints. After this reduction, the forces of constraint and the con- 
strained variables both disappear from the calculation, leaving Lagrange equations 
that look like those of an unconstrained system of (D — C) degrees of freedom. 

This reduction theorem is based on the idea of a reduced Lagrangian. Using the division
into free and bound variables from Theorem 3.4.1, we note that the constraint
equations, eqn (3.1), may be written as

$$0 = G_a(q^{(f)}, q^{(b)}, t) \qquad (3.35)$$

for $a = 1, \ldots, C$, where we have written the dependency on the free and bound
variables separately. As proved in Theorem D.26.1 of Appendix D, the nonsingularity
of the matrix we have called $g^{(b)}$ in Theorem 3.4.1 is a sufficient condition for eqn
(3.35) to be solved for the bound variables. For $l = (D - C + 1), \ldots, D$,

$$q_l^{(b)} = q_l^{(b)}(q^{(f)}, t) \qquad (3.36)$$

Taking the time derivative of eqn (3.36) we also obtain the generalized velocities of
the bound variables,

$$\dot{q}_l^{(b)} = \dot{q}_l^{(b)}(q^{(f)}, \dot{q}^{(f)}, t) \qquad (3.37)$$

The reduced Lagrangian $\bar{L}$ is defined as the original Lagrangian $L$ with eqns (3.36,
3.37) substituted into it to eliminate the bound variables and their derivatives. Writing
the original Lagrangian with its free and bound variables listed separately,

$$L = L(q, \dot{q}, t) = L(q^{(f)}, q^{(b)}, \dot{q}^{(f)}, \dot{q}^{(b)}, t) \qquad (3.38)$$

the reduced Lagrangian is

$$\bar{L}(q^{(f)}, \dot{q}^{(f)}, t) = L\left(q^{(f)}, q^{(b)}(q^{(f)}, t), \dot{q}^{(f)}, \dot{q}^{(b)}(q^{(f)}, \dot{q}^{(f)}, t), t\right) \qquad (3.39)$$

We may now state the reduction theorem. 

Theorem 3.8.1: Reduced Lagrange Equations 

If the forces of constraint do no virtual work, and if the constraints are holonomic and
functionally independent, then the equations of motion of the system can be reduced to

$$\frac{d}{dt}\left(\frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \dot{q}_k^{(f)}}\right) - \frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial q_k^{(f)}} = 0 \qquad (3.40)$$

for $k = 1, \ldots, (D - C)$, where the reduced Lagrangian is defined by eqn (3.39). These
are $(D - C)$ equations in $(D - C)$ unknowns and so may be solved for the free variables
as functions of time. Thus a complete solution for the motion of the system is obtained.

Proof: The main burden of the proof is to justify the zero on the right side of eqn 
(3.40). Constraints are present, yet there are no Lagrange multiplier expressions on 
the right like the ones we saw in eqn (3.27). 

The proof begins by a transformation to a new system of good generalized coordinates,
similar to that discussed in Section 2.10. This new system, which we will call
the special r-system, has its last $C$ variables defined to be equal to the $C$ constraint
functions $G_a$. Thus, for $a = 1, \ldots, C$,

$$r_{D-C+a} = G_a(q, t) = G_a(q^{(f)}, q^{(b)}, t) \qquad (3.41)$$

The remaining $(D - C)$ variables of the r-system are set equal to the free variables $q^{(f)}$.
For $k = 1, \ldots, (D - C)$,

$$r_k = q_k^{(f)} \qquad (3.42)$$

This choice guarantees that the Jacobian determinant condition eqn (2.59) is satisfied
and hence that the special r-system is a set of good generalized coordinates. For the
second determinant in that equation will have the block form

$$\left|\frac{\partial r(q, t)}{\partial q}\right| = \begin{vmatrix} U & 0 \\ \left(\dfrac{\partial G(q, t)}{\partial q^{(f)}}\right) & g^{(b)} \end{vmatrix} = \left|g^{(b)}\right| \qquad (3.43)$$

where $g^{(b)}$ is the matrix defined in eqns (3.17, 3.18). The determinant of this matrix
is nonzero by the above assumption concerning the critical minor of eqn (3.3). Note
that we denoted the $(D - C) \times (D - C)$ identity matrix by $U$. It will be useful also to
label free and bound r-variables as $r^{(f)} = r_1, \ldots, r_{(D-C)}$ and $r^{(b)} = r_{(D-C+1)}, \ldots, r_D$.
Due to the definition in eqn (3.41), in the r-system each constraint function depends
only on a single $r^{(b)}$ coordinate,

$$G_a(r, t) = r^{(b)}_{(D-C+a)} \qquad (3.44)$$

for $a = 1, \ldots, C$. Thus the general Lagrange equations, eqn (3.27), now expressed in
the special r-system, are

$$\frac{d}{dt}\left(\frac{\partial L(r, \dot{r}, t)}{\partial \dot{r}_k}\right) - \frac{\partial L(r, \dot{r}, t)}{\partial r_k} = \sum_{a=1}^{C} \lambda_a\,\frac{\partial G_a(r, t)}{\partial r_k} = 0 \qquad (3.45)$$

for $k = 1, \ldots, (D - C)$, and

$$\frac{d}{dt}\left(\frac{\partial L(r, \dot{r}, t)}{\partial \dot{r}_l}\right) - \frac{\partial L(r, \dot{r}, t)}{\partial r_l} = \sum_{a=1}^{C} \lambda_a\,\frac{\partial G_a(r, t)}{\partial r_l} = \lambda_{(l-D+C)} \qquad (3.46)$$

for $l = (D - C + 1), \ldots, D$, with the equation of constraint $0 = G_a(r, t)$ giving

$$r_l^{(b)} = 0 \qquad (3.47)$$

for $l = (D - C + 1), \ldots, D$. The zero on the right side of eqn (3.45) follows from eqn
(3.44). In the r-system, the $G_a(r, t)$ constraint functions do not depend on the free
variables and hence $\partial G_a(r, t)/\partial r_k = 0$ for $k = 1, \ldots, (D - C)$.

Two immediate simplifications are possible now. First, we can drop eqn (3.46).
As we will see, it is not needed to solve the problem. Second, we can note that the
partials in eqn (3.45) are all with respect to the free variables. Thus eqn (3.47) setting
the bound variables to zero can be applied in eqn (3.45) even before the partial
derivatives are taken. For $k = 1, \ldots, (D - C)$, we can write

$$\left.\frac{\partial L(r, \dot{r}, t)}{\partial r_k}\right|_{r^{(b)},\,\dot{r}^{(b)} = 0} = \frac{\partial}{\partial r_k}\left(L(r, \dot{r}, t)\Big|_{r^{(b)},\,\dot{r}^{(b)} = 0}\right) \qquad (3.48)$$

with a similar result for partials with respect to the free $\dot{r}_k$. If we define the reduced
Lagrangian $\bar{L}$ by

$$\bar{L}(r^{(f)}, \dot{r}^{(f)}, t) = L(r, \dot{r}, t)\Big|_{r^{(b)},\,\dot{r}^{(b)} = 0} \qquad (3.49)$$

eqn (3.45) can then be written, for $k = 1, \ldots, (D - C)$, as

$$\frac{d}{dt}\left(\frac{\partial \bar{L}(r^{(f)}, \dot{r}^{(f)}, t)}{\partial \dot{r}_k}\right) - \frac{\partial \bar{L}(r^{(f)}, \dot{r}^{(f)}, t)}{\partial r_k} = 0 \qquad (3.50)$$

But, except for the use of $r^{(f)}$ to denote the free variables rather than $q^{(f)}$, the reduced
Lagrangian $\bar{L}(r^{(f)}, \dot{r}^{(f)}, t)$ in eqn (3.49) is identical$^{18}$ to the reduced Lagrangian


18 To obtain the Lagrangian $L(r, \dot{r}, t)$, the definitions $r^{(f)} = q^{(f)}$ and $r^{(b)} = G(q^{(f)}, q^{(b)}, t)$ in eqns
(3.41, 3.42) must be inverted to give $q^{(f)} = r^{(f)}$ and $q^{(b)} = q^{(b)}(r^{(f)}, r^{(b)}, t)$. These functions and their
derivatives are then substituted into $L(q, \dot{q}, t)$ to get $L(r, \dot{r}, t)$. Setting $r^{(b)} = 0$ and $\dot{r}^{(b)} = 0$ in $L(r, \dot{r}, t)$,
as is done in eqn (3.49), then gives a result that becomes identical to eqn (3.39) when labeling of the free
variables is changed from $r^{(f)}$ to the equivalent $q^{(f)}$.
$\bar{L}(q^{(f)}, \dot{q}^{(f)}, t)$ defined in eqn (3.39). And recall that eqn (3.42) makes $q_k^{(f)} = r_k$ for all
$k = 1, \ldots, (D - C)$. Thus eqn (3.50) may be rewritten as

$$\frac{d}{dt}\left(\frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \dot{q}_k^{(f)}}\right) - \frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial q_k^{(f)}} = 0 \qquad (3.51)$$

for $k = 1, \ldots, (D - C)$, as was to be proved. □

3.9 Example of a Reduction

Suppose that we have a system of one mass $m$ moving under an acceleration of gravity
$\mathbf{g} = -g\hat{\mathbf{e}}_3$. The Lagrangian in the s-system (with $s_1 = x$, $s_2 = y$, $s_3 = z$) is

$$L(s, \dot{s}, t) = \frac{m}{2}\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) - mgz \qquad (3.52)$$



Fig. 3.3. Mass m is constrained to slide without friction on the surface of a sphere of radius a. 
Gravity is assumed to be acting downward, in the negative z direction. 



Now suppose that the mass is constrained to move on the surface of a frictionless
sphere of radius $a$ by the constraint equation

$$0 = G_1(s, t) = \sqrt{x^2 + y^2 + z^2} - a \qquad (3.53)$$

Assuming that we are interested only in motions above the x-y plane, we can solve
eqn (3.53) for $z$ giving

$$z = \sqrt{a^2 - x^2 - y^2} \qquad (3.54)$$

and its derivative

$$\dot{z} = -\frac{x\dot{x} + y\dot{y}}{\sqrt{a^2 - x^2 - y^2}} \qquad (3.55)$$

We define$^{19}$ the set of free variables to be $s^{(f)} = x, y$ and the single bound variable
to be $s^{(b)} = z$. Substituting eqns (3.54, 3.55) into eqn (3.52) gives the reduced

19 Note again that there are often several possible ways of making the bound-free division. Here it is
obvious from the symmetry of the problem that any one of $x, y, z$ could be chosen to be the bound variable.
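Lagrangian

$$\bar{L}(s^{(f)}, \dot{s}^{(f)}, t) = \frac{m}{2}\left(\dot{x}^2 + \dot{y}^2 + \frac{(x\dot{x} + y\dot{y})^2}{a^2 - x^2 - y^2}\right) - mg\sqrt{a^2 - x^2 - y^2} \qquad (3.56)$$

from which we can derive the two Lagrange equations

$$i = 1: \quad \frac{d}{dt}\left(\frac{\partial \bar{L}(s^{(f)}, \dot{s}^{(f)}, t)}{\partial \dot{x}}\right) - \frac{\partial \bar{L}(s^{(f)}, \dot{s}^{(f)}, t)}{\partial x} = 0 \qquad (3.57)$$

$$i = 2: \quad \frac{d}{dt}\left(\frac{\partial \bar{L}(s^{(f)}, \dot{s}^{(f)}, t)}{\partial \dot{y}}\right) - \frac{\partial \bar{L}(s^{(f)}, \dot{s}^{(f)}, t)}{\partial y} = 0 \qquad (3.58)$$

and so solve the problem. The number of degrees of freedom has been reduced from
$D = 3$ to $D - C = 3 - 1 = 2$.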



3.10 Example of a Simpler Reduction Method 

In some special cases, it may be possible to choose an initial q-system that matches 
the symmetries of the constraints. Then the calculation of the reduced Lagrangian 
becomes particularly simple. 

Suppose that the initial q-system is chosen so that the equations of constraint
depend only on the bound variables $q^{(b)}$. It follows that the constraint equations

$$0 = G_a(q, t) = G_a(q^{(b)}, t) \qquad (3.59)$$

for $a = 1, \ldots, C$, constitute $C$ independent functions of the $C$ variables $q^{(b)}$ and the
time. Thus the solution for the bound variables in eqn (3.36) now gives these bound
variables as functions of time alone, rather than as functions of the free variables and
the time. Thus

$$q_l^{(b)} = q_l^{(b)}(t) \qquad (3.60)$$

for $l = (D - C + 1), \ldots, D$. The derivatives $\dot{q}_l^{(b)} = \dot{q}_l^{(b)}(t)$ may then be calculated
from these equations, and will also be functions of time only. The calculation of the
reduced Lagrangian is thus simplified.

For example, the constraint of Section 3.9 has spherical symmetry. If we choose
a system of coordinates $q_1 = \theta$, $q_2 = \phi$, $q_3 = r$ where $r, \theta, \phi$ are spherical polar
coordinates, then the constraint equation depends only on $q_3$. So we may define the
free variables to be $q^{(f)} = \theta, \phi$ and the single bound variable to be $q^{(b)} = r$. When
converted to this q-system, the constraint equation, eqn (3.53), becomes

$$G_1(q, t) = r - a \qquad (3.61)$$

which depends only on the bound variable $r$ and so can be solved immediately for
$r = a$ and $\dot{r} = 0$.
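In spherical polar coordinates, the full Lagrangian of eqn (3.52) becomes

$$L(q, \dot{q}, t) = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right) - mgr\cos\theta \qquad (3.62)$$

Solving the constraint equation eqn (3.61) for $r = a$, $\dot{r} = 0$ and inserting these into
eqn (3.62) gives the simple reduced Lagrangian

$$\bar{L}(q^{(f)}, \dot{q}^{(f)}, t) = \frac{m}{2}\left(a^2\dot{\theta}^2 + a^2\sin^2\theta\,\dot{\phi}^2\right) - mga\cos\theta \qquad (3.63)$$

from which we derive the two reduced Lagrange equations

$$\frac{d}{dt}\left(\frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \dot{\theta}}\right) - \frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \theta} = 0 \qquad (3.64)$$

$$\frac{d}{dt}\left(\frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \dot{\phi}}\right) - \frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \phi} = 0 \qquad (3.65)$$

which may be used to derive the equations of motion.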
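
Carrying out the derivatives in the two reduced Lagrange equations gives $\ddot{\theta} = \sin\theta\cos\theta\,\dot{\phi}^2 + (g/a)\sin\theta$ and $\frac{d}{dt}(m a^2\sin^2\theta\,\dot{\phi}) = 0$. The sketch below integrates these numerically and checks that the generalized energy and the momentum $p_\phi$ stay constant. The parameter values, the initial conditions, and the fixed-step Runge-Kutta integrator are all assumptions made for illustration. The constraint is treated as two-sided, so the motion is followed on the lower half of the sphere as well, where the mass behaves as a spherical pendulum.

```python
import math

G_ACCEL = 9.81   # assumed value of g, m/s^2
RADIUS = 1.0     # assumed sphere radius a, m
MASS = 1.0       # assumed mass m, kg

def derivs(state):
    """Time derivatives of (theta, phi, theta_dot, phi_dot) from the reduced equations."""
    theta, phi, theta_dot, phi_dot = state
    theta_ddot = (math.sin(theta) * math.cos(theta) * phi_dot**2
                  + (G_ACCEL / RADIUS) * math.sin(theta))
    # From d/dt (sin^2(theta) * phi_dot) = 0.
    phi_ddot = -2.0 * math.cos(theta) / math.sin(theta) * theta_dot * phi_dot
    return (theta_dot, phi_dot, theta_ddot, phi_ddot)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = derivs(state)
    k2 = derivs(shifted(state, k1, dt / 2))
    k3 = derivs(shifted(state, k2, dt / 2))
    k4 = derivs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state):
    """Generalized energy of the reduced Lagrangian: kinetic plus potential."""
    theta, _, theta_dot, phi_dot = state
    kinetic = 0.5 * MASS * RADIUS**2 * (theta_dot**2 + math.sin(theta)**2 * phi_dot**2)
    return kinetic + MASS * G_ACCEL * RADIUS * math.cos(theta)

def p_phi(state):
    """Momentum conjugate to the ignorable coordinate phi."""
    theta, _, _, phi_dot = state
    return MASS * RADIUS**2 * math.sin(theta)**2 * phi_dot

state = (2.0, 0.0, 0.0, 1.5)   # illustrative theta, phi, theta_dot, phi_dot at t = 0
e0, p0 = energy(state), p_phi(state)
for _ in range(2000):          # integrate for 2 s with dt = 1 ms
    state = rk4_step(state, 1e-3)
energy_drift = abs(energy(state) - e0)
p_phi_drift = abs(p_phi(state) - p0)
print(energy_drift < 1e-6, p_phi_drift < 1e-6)
```

The integrator breaks down at the poles $\theta = 0, \pi$, where the $\phi$ equation is singular; the nonzero initial $\dot{\phi}$ chosen here keeps the trajectory away from them.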



3.11 Recovery of the Forces of Constraint 

We have given several methods for finding the equations of motion of the system 
without knowing the forces of constraint. Let us suppose that one of them has been 
used, and that we now have the complete solution to the problem, 

$$q_k = q_k(t) \quad\text{and}\quad \dot{q}_k = \dot{q}_k(t) \qquad (3.66)$$

for $k = 1, \ldots, D$. But suppose that we are curious, or otherwise need to know, the
forces of constraint that must be acting to produce this motion. One method has already
been given, in Corollary 3.5.2. Here we will treat this problem in a more general
way which includes solution methods that do not produce the Lagrange multipliers
$\lambda_a$ directly.

Let us define $A_k$, for $k = 1, \ldots, D$, to be those functions of time obtained by
putting the solution, eqn (3.66), into the left side of eqn (3.26),

$$A_k = \left[\frac{d}{dt}\left(\frac{\partial L(q, \dot{q}, t)}{\partial \dot{q}_k}\right) - \frac{\partial L(q, \dot{q}, t)}{\partial q_k}\right]_{q = q(t),\ \dot{q} = \dot{q}(t)} \qquad (3.67)$$

Then eqn (3.26) gives the forces of constraint in the q-system directly as

$$Q_k^{(\mathrm{cons})} = A_k \qquad (3.68)$$

The forces of constraint in other systems can then be found from these by using the
standard transformation formulas like eqn (3.9). One cautionary note: In evaluating
the right side of eqn (3.67), it is essential to use the full Lagrangian, not some reduced
form of it. Also, all indicated partial derivatives must be taken first, before the
solutions $q = q(t)$ are introduced.
But suppose we want to find the forces of constraint in some other system, such
as the s-system. Instead of using eqn (3.9) to convert $Q_k^{(\mathrm{cons})}$ to the s-system, it is
sometimes easier to find the Lagrange multipliers $\lambda_a$ as an intermediate step. Then
these same Lagrange multipliers can be used to find the forces of constraint in any
system of coordinates by making use of equations like eqn (3.14).

Let us define

$$B_{ak} = \left.\frac{\partial G_a(q, t)}{\partial q_k}\right|_{q = q(t)} \qquad (3.69)$$

Then, evaluating both sides of the general Lagrange equations in eqn (3.27) using the
known solution from eqn (3.66), gives the set of linear equations for the Lagrange
multipliers $\lambda_a$,

$$A_k = \sum_{a=1}^{C} \lambda_a B_{ak} \qquad (3.70)$$

where $k = 1, \ldots, D$. These equations are redundant. Since the matrix $B$ has rank $C$
by assumption, one can always select $C$ of them to solve for the $C$ Lagrange multipliers
$\lambda_a$, using Cramer’s rule or some other method. The forces of constraint in, for
example, the s-system can then be found from eqn (3.14),

$$F_i^{(\mathrm{cons})} = \sum_{a=1}^{C} \lambda_a\,\frac{\partial G_a(s, t)}{\partial s_i} \qquad (3.71)$$

for $i = 1, \ldots, D$, where the partial derivatives on the right are evaluated using the
known solution, eqn (3.66), now expressed in the s-system.

Although the formal description given here for finding the forces of constraint may seem complex,
in practice it is often quite simple to apply, as will be seen in the next section.



3.12 Example of a Recovery 



As an example, imagine that we need the forces of constraint exerted by the sphere
in Section 3.10. In this example, $q_1^{(f)} = \theta$ and $q_2^{(f)} = \phi$ are the free variables, and
$q_3^{(b)} = r$ is the bound variable, and there is only one constraint, $C = 1$. Also, that
constraint $G_1(q, t) = r - a$ depends only on $q_3^{(b)} = r$. So there is only one nonzero
matrix element,

$$B_{13} = \left.\frac{\partial G_1(q, t)}{\partial q_3}\right|_{q = q(t)} = 1 \qquad (3.72)$$

and eqn (3.70) for $\lambda_1$ reduces to

$$A_3 = \lambda_1 B_{13} = \lambda_1 \qquad (3.73)$$
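It remains to evaluate $A_3$. Using the full Lagrangian from eqn (3.62) gives

$$A_3 = \left[\frac{d}{dt}\left(\frac{\partial L(q, \dot{q}, t)}{\partial \dot{r}}\right) - \frac{\partial L(q, \dot{q}, t)}{\partial r}\right]_{q = q(t),\ \dot{q} = \dot{q}(t)}$$
$$= \left[\frac{d}{dt}\left(m\dot{r}\right) - m\left(r\dot{\theta}^2 + r\sin^2\theta\,\dot{\phi}^2 - g\cos\theta\right)\right]_{q = q(t),\ \dot{q} = \dot{q}(t)}$$
$$= -m\left(a\dot{\theta}^2 + a\sin^2\theta\,\dot{\phi}^2 - g\cos\theta\right)\Big|_{q = q(t),\ \dot{q} = \dot{q}(t)} \qquad (3.74)$$

where the $\theta$, $\dot{\theta}$, and $\dot{\phi}$ in the final expression must be evaluated using the known solution
previously obtained. Thus

$$\lambda_1 = -m\left(a\dot{\theta}^2 + a\sin^2\theta\,\dot{\phi}^2 - g\cos\theta\right)\Big|_{q = q(t),\ \dot{q} = \dot{q}(t)} \qquad (3.75)$$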

Using this same $\lambda_1$, the forces of constraint in the Cartesian s-system are then
given by eqn (3.71) in the form, for $i = 1, 2, 3$,

$$F_i^{(\text{cons})} = \lambda_1 \left.\frac{\partial G_1(s,t)}{\partial s_i}\right|_{s=s(t)} = \lambda_1 \left.\frac{s_i}{\sqrt{x^2 + y^2 + z^2}}\right|_{s=s(t)} \tag{3.76}$$

where eqn (3.53) was used. In vector form,

$$\mathbf{F}^{(\text{cons})} = \lambda_1 \hat{\mathbf{r}} \tag{3.77}$$

which verifies the expected result that the force of constraint is entirely in the radial
direction.
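The result of this example can be checked numerically. The sketch below evaluates eqn (3.75) for $\lambda_1$ and eqn (3.76) for the Cartesian components, assuming the state $(\theta, \phi, \dot\theta, \dot\phi)$ of the known solution is supplied by the caller. The function name and argument order are illustrative choices, not from the text.

```python
import math
from typing import Tuple


def sphere_constraint_force(
    m: float, a: float, g: float,
    theta: float, phi: float, theta_dot: float, phi_dot: float,
) -> Tuple[float, Tuple[float, float, float]]:
    """Return (lambda_1, (Fx, Fy, Fz)) for a mass held on the sphere r = a."""
    # Lagrange multiplier from eqn (3.75).
    lam = -m * (a * theta_dot ** 2
                + a * math.sin(theta) ** 2 * phi_dot ** 2
                - g * math.cos(theta))
    # Cartesian position of the mass on the sphere.
    x = a * math.sin(theta) * math.cos(phi)
    y = a * math.sin(theta) * math.sin(phi)
    z = a * math.cos(theta)
    r = math.sqrt(x * x + y * y + z * z)
    # eqn (3.76): F_i = lambda_1 * s_i / sqrt(x^2 + y^2 + z^2).
    return lam, (lam * x / r, lam * y / r, lam * z / r)
```

For a mass at rest at the top of the sphere ($\theta = 0$) the multiplier reduces to $\lambda_1 = mg$, so the force of constraint is the upward normal force that balances the weight, as expected.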



3.13 Generalized Energy Theorem with Constraints 

The generalized energy function in a system with constraints is the same as that 
defined in Section 2.15. The generalized energy theorem is modified, however. 

Theorem 3.13.1: Generalized Energy Theorem with Constraints 

When the only non-potential forces are constraint forces that do no virtual work, the
generalized energy theorem becomes

$$\dot H_q = -\frac{\partial L(q,\dot q,t)}{\partial t} - \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(q,t)}{\partial t} \tag{3.78}$$

where $H_q$ is the same generalized energy function as was defined earlier by eqn (2.76).

Proof: First note that, using the standard definition eqn (2.68) for the generalized
momenta, the general Lagrange equations of eqn (3.27) may be written in the alternate
form

$$\dot p_k = \frac{\partial L(q,\dot q,t)}{\partial q_k} + \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(q,t)}{\partial q_k} \tag{3.79}$$



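The proof of eqn (3.78) is the same as the proof given in Theorem 2.15.1 up to and
including eqn (2.80). The terms cancel as before, but the use of eqn (3.79) for $\dot p_k$
instead of eqn (2.72) leads, after some cancellation, to the expression

$$\begin{aligned}
\dot H_q &= \sum_{k=1}^{D}\left(\sum_{a=1}^{C} \lambda_a \frac{\partial G_a(q,t)}{\partial q_k}\right)\dot q_k - \frac{\partial L(q,\dot q,t)}{\partial t} \\
&= \sum_{a=1}^{C} \lambda_a \left(\sum_{k=1}^{D} \frac{\partial G_a(q,t)}{\partial q_k}\,\dot q_k\right) - \frac{\partial L(q,\dot q,t)}{\partial t} \\
&= \sum_{a=1}^{C} \lambda_a \left(\frac{dG_a(q,t)}{dt} - \frac{\partial G_a(q,t)}{\partial t}\right) - \frac{\partial L(q,\dot q,t)}{\partial t}
\end{aligned} \tag{3.80}$$

But eqn (3.1) implies that $dG_a(q,t)/dt = 0$, leading at once to eqn (3.78), as was to
be proved. □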

It follows from Theorem 3.13.1 that, if both the Lagrangian $L(q,\dot q,t)$ and the
constraint functions $G_a(q,t)$ in the q-system do not contain the letter $t$ explicitly, the
generalized energy function $H_q$ will be a constant of the motion, equal to its initial
value at $t = 0$.

The result in the s-system is similar, with a similar proof,$^{20}$

$$\dot H_s = -\frac{\partial L(s,\dot s,t)}{\partial t} - \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(s,t)}{\partial t} \tag{3.81}$$



An alternate generalized energy theorem is also possible in systems in which holonomic
constraints have been used to reduce the number of degrees of freedom from
$D$ to $(D - C)$. It begins with the reduced Lagrangian of eqn (3.39). Define a reduced,
generalized energy function $\bar H_q$ by

$$\bar H_q = \sum_{k=1}^{D-C} \dot q_k^{(f)}\,\frac{\partial \bar L(q^{(f)},\dot q^{(f)},t)}{\partial \dot q_k^{(f)}} - \bar L(q^{(f)},\dot q^{(f)},t) \tag{3.82}$$

Then a proof almost identical to that in Theorem 2.15.1 shows that

$$\frac{d\bar H_q}{dt} = -\frac{\partial \bar L(q^{(f)},\dot q^{(f)},t)}{\partial t} \tag{3.83}$$



20 Equation (3.81) illustrates again that the forces of constraint may do real work even when they do no
virtual work. In the s-system, the generalized energy function $H_s$ will always equal the total energy. When
the constraint is time varying so that $\partial G_a(s,t)/\partial t \neq 0$, the constraint forces are seen to contribute to the
rate of change of $H_s$.

Thus, if the reduced Lagrangian $\bar L$ does not contain the letter $t$ explicitly, then the
reduced generalized energy function $\bar H_q$ will be a constant of the motion, equal to its
initial value at $t = 0$.
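The conservation of the reduced generalized energy can be illustrated numerically. The sketch below assumes the reduced Lagrangian of the sphere example with $r = a$ substituted, $\bar L = \tfrac{1}{2}ma^2(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) - mga\cos\theta$, which does not contain $t$ explicitly. It integrates the two reduced Lagrange equations with a classical fourth-order Runge-Kutta step (an illustrative choice of integrator) and reports the drift in $\bar H_q$. The initial state and step size are arbitrary assumptions.

```python
import math
from typing import Tuple

State = Tuple[float, float, float, float]  # (theta, phi, theta_dot, phi_dot)


def derivs(state: State, a: float, g: float) -> State:
    """Time derivatives from the two reduced Lagrange equations."""
    theta, _phi, theta_dot, phi_dot = state
    theta_ddot = (math.sin(theta) * math.cos(theta) * phi_dot ** 2
                  + (g / a) * math.sin(theta))
    phi_ddot = -2.0 * math.cos(theta) / math.sin(theta) * theta_dot * phi_dot
    return (theta_dot, phi_dot, theta_ddot, phi_ddot)


def rk4_step(state: State, dt: float, a: float, g: float) -> State:
    """Advance the state by one classical Runge-Kutta step."""
    def shifted(s: State, k: State, h: float) -> State:
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = derivs(state, a, g)
    k2 = derivs(shifted(state, k1, dt / 2), a, g)
    k3 = derivs(shifted(state, k2, dt / 2), a, g)
    k4 = derivs(shifted(state, k3, dt), a, g)
    return tuple(s + dt / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))


def reduced_energy(state: State, m: float, a: float, g: float) -> float:
    """Reduced generalized energy built from the assumed reduced Lagrangian."""
    theta, _phi, theta_dot, phi_dot = state
    kinetic = 0.5 * m * a * a * (theta_dot ** 2
                                 + math.sin(theta) ** 2 * phi_dot ** 2)
    return kinetic + m * g * a * math.cos(theta)


def energy_drift(m: float = 1.0, a: float = 1.0, g: float = 9.8,
                 dt: float = 1e-4, steps: int = 2000) -> float:
    """Absolute change of the reduced energy over the integration interval."""
    state: State = (0.3, 0.0, 0.0, 1.0)  # assumed initial conditions
    h0 = reduced_energy(state, m, a, g)
    for _ in range(steps):
        state = rk4_step(state, dt, a, g)
    return abs(reduced_energy(state, m, a, g) - h0)
```

The drift returned by `energy_drift()` should be at the level of integration error only, consistent with eqn (3.83) when $\partial \bar L/\partial t = 0$.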

3.14 Tractable Non-Holonomic Constraints 

To be treated by the Lagrangian method, constraints must at least define a definite
relation between displacements of the generalized coordinates. Other things that might
be thought of as constraints, such as inequalities like $a_k < q_k < b_k$ defining walls of a
room, cannot be treated by the methods described here.$^{21}$
However, differential constraints of the form

$$0 = \sum_{k=1}^{D} g_{ak}(q,t)\,dq_k + g_{a0}(q,t)\,dt \tag{3.84}$$

or equivalently in the s-system

$$0 = \sum_{i=1}^{D} f_{ai}(s,t)\,ds_i + f_{a0}(s,t)\,dt \tag{3.85}$$

where $a = 1, \ldots, C$, and the constraints are related by

$$g_{ak} = \sum_{i=1}^{D} f_{ai}\,\frac{\partial s_i(q,t)}{\partial q_k} \quad\text{and}\quad g_{a0} = f_{a0} + \sum_{i=1}^{D} f_{ai}\,\frac{\partial s_i(q,t)}{\partial t} \tag{3.86}$$

can be treated even though the differential expression in eqn (3.84) is not a perfect
differential and hence cannot be integrated to give a holonomic constraint function
$G_a(q,t)$. These will be called tractable non-holonomic constraints.$^{22}$

In the case of tractable but non-holonomic constraints, the allowed virtual displacements
are defined to be those that satisfy an equation equivalent to eqns (3.5, 3.8).
For $a = 1, \ldots, C$, with $\partial G_a/\partial q_k$ replaced by $g_{ak}$,

$$0 = \sum_{k=1}^{D} g_{ak}\,\delta q_k \tag{3.87}$$

Theorem 3.4.1 then can be generalized to say that $\delta W^{(\text{cons})} = 0$ for all allowed

21 In Lagrangian mechanics, a ball confined to a box with perfectly elastic, rigid walls would be treated
as a series of problems. Each problem would end when the ball hits a wall, the reflection conditions would
be applied, and the next problem would begin with the resulting initial conditions.

22 The condition for differential expression eqn (3.84) to be a perfect differential which can be integrated
to yield a potential function like $G_a(q,t)$ is given in Section D.20. Since each term of the homogeneous
eqn (3.84) could be multiplied by an integrating function $u_a(q,t)$ without changing the implied relation
between the differentials $dq_k$, the general condition for the integrability of eqn (3.84) for the $a$th constraint
is that, for some nonzero integrating function $u_a(q,t)$, $\partial(u_a g_{ak})/\partial q_l = \partial(u_a g_{al})/\partial q_k$ for every pair of
indices $k, l$. Also, $\partial(u_a g_{ak})/\partial t = \partial(u_a g_{a0})/\partial q_k$ must hold for every $k$ value. If no such integrating function
exists, then the constraint is non-holonomic.

virtual displacements if and only if the constraint forces have the following form

$$Q_k^{(\text{cons})} = \sum_{a=1}^{C} \lambda_a\, g_{ak}(q,t) \quad\text{or, equivalently,}\quad F_i^{(\text{cons})} = \sum_{a=1}^{C} \lambda_a\, f_{ai}(s,t) \tag{3.88}$$

If the constraint is holonomic, then $g_{ak} = \partial G_a(q,t)/\partial q_k$ and we recover eqn (3.14).
But if the constraint is non-holonomic, eqn (3.88) still applies, with the $g_{ak}$ taken from
eqn (3.84). The proof of this generalization is the same as that in Section 3.4. That
proof used only virtual displacements, and the fact that $g_{ak}$ was equal to $\partial G_a/\partial q_k$
played no essential role in it.

Thus, for the case of tractable but non-holonomic constraints, the general Lagrange
equations, eqn (3.27), become

$$\frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\right) - \frac{\partial L(q,\dot q,t)}{\partial q_k} = \sum_{a=1}^{C} \lambda_a\, g_{ak}(q,t) \tag{3.89}$$

with the constraint equation

$$0 = \sum_{k=1}^{D} g_{ak}(q,t)\,\dot q_k + g_{a0}(q,t) \tag{3.90}$$

where $k = 1, \ldots, D$ and $a = 1, \ldots, C$. These are $(D + C)$ differential equations for
the $(D + C)$ unknown functions $q, \lambda$ and therefore can be solved. Similar equations
hold in the s-system, with $f_{ai}$ in place of $g_{ak}$.

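The generalized energy theorem, Theorem 3.13.1, becomes

$$\dot H_q = -\frac{\partial L(q,\dot q,t)}{\partial t} - \sum_{a=1}^{C} \lambda_a\, g_{a0}(q,t) \tag{3.91}$$

or, in the s-system,

$$\dot H_s = -\frac{\partial L(s,\dot s,t)}{\partial t} - \sum_{a=1}^{C} \lambda_a\, f_{a0}(s,t) \tag{3.92}$$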

Some problems combine holonomic and non-holonomic constraints. In that case, 
the holonomic ones may be used to reduce the degrees of freedom of the system as 
outlined in Section 3.8. The non-holonomic ones may then be included by using the 
methods of the present section, but starting from the reduced Lagrangian. 
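As a hedged illustration of eqns (3.84) and (3.89) to (3.91), the following worked example applies them to a simple constraint. The example is not from the text: the free-particle Lagrangian, the constraint $\dot y = t\,\dot x$, and the initial speed $v_0$ are assumptions chosen only because the resulting equations can be solved by hand.

```latex
% Assumed system: one particle of mass m in the x-y plane, no potential.
\begin{align*}
L &= \tfrac{1}{2} m \left(\dot x^2 + \dot y^2\right), \\
% One constraint (C = 1) in the form of eqn (3.84):
0 &= -t\,dx + dy, \qquad g_{1x} = -t, \quad g_{1y} = 1, \quad g_{10} = 0 . \\
% Integrability test of the footnote condition: an integrating function u must obey
% \partial(u g_{1y})/\partial t = 0 and \partial(u g_{1x})/\partial t = 0, which force u = 0.
% Hence the constraint is non-holonomic.
% Lagrange equations (3.89) and constraint equation (3.90):
m\ddot x &= -\lambda_1 t, \qquad m\ddot y = \lambda_1, \qquad 0 = -t\,\dot x + \dot y . \\
% Differentiate the constraint, \ddot y = \dot x + t\,\ddot x, and eliminate the accelerations:
\lambda_1 &= \frac{m \dot x}{1 + t^2}, \qquad \ddot x = -\frac{t\,\dot x}{1 + t^2} . \\
% Solution with \dot x(0) = v_0:
\dot x &= \frac{v_0}{\sqrt{1 + t^2}}, \qquad
\dot y = \frac{v_0\, t}{\sqrt{1 + t^2}}, \qquad
\lambda_1 = \frac{m v_0}{(1 + t^2)^{3/2}} . \\
% Energy theorem (3.91): \partial L/\partial t = 0 and g_{10} = 0 give \dot H_q = 0, and indeed
H_q &= \tfrac{1}{2} m \left(\dot x^2 + \dot y^2\right) = \tfrac{1}{2} m v_0^2 .
\end{align*}
```

The constraint force $(-\lambda_1 t, \lambda_1)$ is perpendicular to the velocity $(\dot x, t\,\dot x)$, so it does no work here, which is consistent with the vanishing of the $g_{10}$ term in eqn (3.91).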



3.15 Exercises 

General note: These exercises are intended to help you master the Lagrangian theory of
constraints. Therefore, they must be done using those methods, even if some of them are so
simple that elementary approaches would also be possible.

Fig. 3.4. Illustration for Exercise 3.1.

Exercise 3.1 Consider the plane double pendulum from Exercise 2.11.

(a) The sticks of the pendulum are now constrained to have fixed lengths $a_1$ and $a_2$. Write
the two constraint functions $G_1(s,t)$ and $G_2(s,t)$ in terms of the s-system variables. Now
express these same functions in terms of the q-system variables, as $G_1(q,t)$ and $G_2(q,t)$.

(b) Use the full Lagrangians from Exercise 2.11, $L(s,\dot s,t)$ in the s-system and $L(q,\dot q,t)$ in
the q-system, to write all four Lagrange equations, using the Lagrange multipliers $\lambda_1$ and $\lambda_2$
as appropriate. Do this in both the s- and the q-systems.

(c) Taking $\theta_1$ and $\theta_2$ as your free variables, write the reduced Lagrangian in the reduced
q-system $\bar L(q^{(f)},\dot q^{(f)},t)$ and the two Lagrange equations for the free variables in that system.

(d) Suppose that you are able to solve the equations in part (c) for $\theta_1(t)$ and $\theta_2(t)$. State
clearly, showing the exact formulas you would use, how you would calculate the Cartesian
components of the force of constraint on each of the masses.




Fig. 3.5. Illustration for Exercise 3.2. 



Exercise 3.2 A mass $m$ slides on the inner surface of a conical hole in frictionless ice. The
cone has half-angle $\alpha$. Gravity acts downward, $\mathbf{g} = -g\hat{\mathbf{e}}_3$. At $t = 0$, the mass has spherical
polar coordinates $r_0 > 0$, $\dot r_0 = 0$, $\phi_0 = \pi$, $\dot\phi_0 > 0$. With the origin of coordinates at the vertex of
the cone (bottom of the hole), the mass is constrained to have $\theta = \alpha$. The ice is fragile. Its
surface can only provide a normal force less than $F_{\max}$. Find the radius at which the mass
breaks through the ice surface.




Fig. 3.6. Illustration for Exercises 3.3 and 3.4. The dotted rectangle represents an imaginary door 
swinging from hinges on the z-axis. The wire of the parabola is entirely in the plane of this door. 

Exercise 3.3 A rigid wire of negligible mass is bent into the shape of a parabola and
suspended from the z-axis by frictionless pivots at $z = \pm a$. The equation of the wire, in
cylindrical polar coordinates, is $\rho = b(1 - z^2/a^2)$. A bead of mass $m$ slides without friction on
the wire. The acceleration of gravity is $\mathbf{g} = -g\hat{\mathbf{e}}_3$. Choose cylindrical polar coordinates as
your generalized coordinates. Assume the initial conditions at $t = 0$ as follows: $0 < z_0 < a$,
$\dot z_0 = 0$, $\phi_0 = 0$, $\dot\phi_0 > 0$.

(a) Write the full Lagrangian for this problem.

(b) Write the three Lagrange equations, using Lagrange multipliers as appropriate.

(c) Now, choosing your free variables to be $z$ and $\phi$, write the reduced Lagrangian $\bar L$ and the
two Lagrange equations derived from it.

(d) Use the result of (c) to write the reduced generalized energy $\bar H_q$. Is it conserved? If so
why, if not why not?

(e) Use the results so far obtained to write expressions for p, 0, p, 0, z, p, 0, z as functions 
of $z$ only. [Note: These expressions, and the ones in the next part, may of course also depend
on the initial values $z_0$, $\dot\phi_0$ and on the parameters $a$, $b$, $m$, and $g$.]

(f) Find the Cartesian vector force of constraint exerted by the wire on the mass for $t > 0$,
expressing it as a function of $z$ only.

Exercise 3.4 This problem has the same geometry as Exercise 3.3, but now there is an
additional constraint: $\phi = \omega_0 t$ where $\omega_0$ is a given constant.

(a) Choosing $z$ as your free coordinate, form the reduced Lagrangian and write the single
Lagrange equation derived from it.

(b) Derive the reduced generalized energy function. Is it conserved? If so why, if not why
not? [Note: This $\bar H_q$ will not be the same as the one derived in Exercise 3.3.]

(c) What is the smallest value of $\omega_0$ such that there will be at least one point (equilibrium
point) such that if $z_0$ is set equal to that value with $\dot z_0 = 0$, the mass will remain at that height
for all time?
Exercise 3.5 We use cylindrical polar coordinates in this problem. A roller-coaster car of
mass $m$ slides without friction on a track defined by the constraints

$$\rho = \rho_0 + a\phi \quad\text{and}\quad z = z_0 - b\phi \tag{3.93}$$

where $a, b > 0$. At $t = 0$ the mass is at rest at

$$\mathbf{r}_0 = \rho_0\hat{\mathbf{e}}_1 + z_0\hat{\mathbf{e}}_3 \tag{3.94}$$

(a) Write the full Lagrangian $L(q,\dot q,t)$ using a q-system consisting of cylindrical polar
coordinates $\phi, \rho, z$.

(b) Write the three Lagrange equations in the q-system, putting in $\lambda_1$ and $\lambda_2$ correctly.

(c) Now use the constraints to eliminate $\rho, z, \dot\rho, \dot z$, leaving $\phi$ as your free variable. Write the
reduced Lagrangian $\bar L = \bar L(\phi,\dot\phi,t)$.

Fig. 3.7. Illustration for Exercise 3.5.

(d) Write the Lagrange equation using $\bar L(\phi,\dot\phi,t)$ and solve the resulting equation for $\ddot\phi$ as a
function of $\phi$ and $\dot\phi$.

(e) Write the reduced generalized energy $\bar H_q$ based on the reduced Lagrangian $\bar L$, and use it
to derive an equation for $\dot\phi$ as a function of $\phi$ and an integration constant that you determine
from the given initial conditions.

(f) From parts (d) and (e) you now have $\ddot\phi$ as a function of $\phi$ and $\dot\phi$, and also $\dot\phi$ as a function
of $\phi$. Thus you effectively have both $\ddot\phi$ and $\dot\phi$ as functions of $\phi$ only. Write an expression for
the Cartesian vector force of constraint that the track exerts on the car, writing it as a function
only of the given parameters and the variables $\phi, \dot\phi, \ddot\phi$. [This expression, of course, could now
be used to write the force of constraint out as a function of $\phi$ only if you wished. But it is
clearer just to leave the result as it is, and cite the results of parts (d) and (e) to anyone who
wants it as a function of $\phi$ only (e.g. a designer who needs to know how strong to make the
track).]

Fig. 3.8. Illustration for Exercise 3.6.

Exercise 3.6 Suppose a mass $m_1$ slides without friction on a horizontal table. There is a hole
in the center of the table. A massless string runs along the table top from $m_1$ to the hole,
through the hole, and down below the table, where it is attached to another mass $m_2$. The
origin of coordinates is at the center of the hole, with the $\hat{\mathbf{e}}_3$ axis pointing upwards. Gravity
acts downwards. Consider the hole to have a size big enough to let the string through without
friction, but small enough to be neglected in our calculations.

(a) Using cylindrical polar coordinates for $m_1$ and Cartesian coordinates for $m_2$, write the full
Lagrangian for this two-mass system.

(b) Now apply the following constraints: Mass $m_1$ is always at the level of the table's surface.
Mass $m_2$ is enclosed in a vertical plastic tube just large enough to hold it at $x_2 = y_2 = 0$
while exerting no friction forces on it. The string length is $l_0$ and never changes. With these
constraints, write the full Lagrange equations, including the Lagrange multipliers as required.

(c) Use the constraints to write a reduced Lagrangian with free coordinates $\rho_1$ and $\phi_1$, the
cylindrical polar coordinates of the mass $m_1$ on the top of the table.

(d) Write the two reduced Lagrange equations and show that the one for $\phi_1$ can be integrated
immediately to give $\dot\phi_1$ as a function of $\rho_1$ and constants determined at $t = 0$. Assume that
$\dot\phi_1(0) > 0$ at time zero. Use this result to write the other reduced Lagrange equation as an
ordinary differential equation involving only $\rho_1$ and its derivatives.




Fig. 3.9. Illustration for Exercise 3.7. 

Exercise 3.7 Consider a single mass $m$ to be sliding without friction on the outside surface
of a sphere of radius $a$. Suppose that at time zero, it has spherical polar coordinates $\theta_0 > 0$,
$\phi_0 = 0$ and generalized velocities $\dot\theta_0 = 0$ and $\dot\phi_0 > 0$.

(a) Using spherical polar coordinates, write both the full Lagrangian and the reduced
Lagrangian for this problem.

(b) Write the Lagrange equations for both the full and the reduced Lagrangians.

(c) Use the reduced Lagrangian $\bar L$ to write the reduced generalized energy $\bar H_q$.

(d) The mass will leave the surface of the sphere at the instant at which the normal force
of constraint becomes negative (the sphere cannot pull in on the mass, only push outwards).
Write an expression for the angle $\theta_{\max}$ at which the mass leaves the sphere and the problem
ends.

(e) Define the parameter $\gamma$ by $a\dot\phi_0^2 = \gamma g$, where $g$ is the acceleration of gravity, and express
$\theta_{\max}$ as a function of $a$, $g$, $\gamma$, $\theta_0$ only. In the case with $\gamma = 0$, find $\theta_{\max}$ in the limit $\theta_0 \to 0$.

(f) With $\theta_0 = 10^\circ$, find the numerical value of $\theta_{\max}$ for the case in which $\gamma = 0$. How big
would $\gamma$ have to be in order to reduce $\theta_{\max}$ to $45^\circ$?




Fig. 3.10. Figure for Exercise 3.8. 



Exercise 3.8 A bead of mass $m$ slides without friction on a rigid wire that lies in the x-z
plane and has the shape $z = a e^{-\gamma x}$, where $\gamma$ is some given positive constant. Gravity acts
downwards, with $\mathbf{g} = -g\hat{\mathbf{e}}_3$.

(a) Write the full Lagrangian for this problem, and write equations for the two holonomic
constraints, first that the mass is confined to the plane $y = 0$, and second that it is confined to
the surface $z = a e^{-\gamma x}$.

(b) Write the three Lagrange equations, introducing the Lagrange multipliers $\lambda_1$ and $\lambda_2$ as
appropriate.

(c) Use the constraints to write a reduced Lagrangian $\bar L(x,\dot x,t)$, with $x$ serving as the single
free coordinate. Derive the reduced generalized energy from this reduced Lagrangian, and
use it to find an expression for $\dot x^2$ as a function of $x$. (Assume that the mass is released from rest
at the point $x = 0$.) Also use the reduced Lagrange equation to find an expression for $\ddot x$ as a
function of $x$ and $\dot x$.

(d) Write an expression for the Cartesian vector force of constraint $\mathbf{F}^{(\text{cons})}$ acting on the
particle, expressing it as a function of $x$ only. Check the limit of $\mathbf{F}^{(\text{cons})}$ as $x \to \infty$. Is it
reasonable?



Exercise 3.9 A fixed, right circular cylinder (first cylinder) of radius $a$ lies on its side, with
its symmetry axis horizontal. A hollow right circular cylinder (second cylinder) of radius $b$
and mass $m$ is free to roll without slipping on the first one. Assume that its symmetry axis
remains aligned with that of the first cylinder. The full Lagrangian for the second cylinder's
motion is

$$L = \frac{1}{2}m\left(\dot r^2 + r^2\dot\theta^2\right) + \frac{1}{2}mb^2\dot\phi^2 - mgr\cos\theta \tag{3.95}$$

where $r$ is the distance between the axes of the two cylinders, and $\theta$ and $\phi$ are the angles
shown in the figure. (This "full" Lagrangian is actually partially reduced. Constraints not
relevant to this exercise have already been applied.) Notice that $\phi$ is the angle between vertical
and a mark inscribed on the face of the second cylinder.

Fig. 3.11. Figure for Exercise 3.9. The upper cylinder rolls without slipping on the lower one.

(a) Write the two constraint functions, $G_1$ expressing the constraint that the second cylinder
is in contact with the first one, and $G_2$ expressing the condition of rolling without slipping.
Assume that $\phi = 0$ when $\theta = 0$.

(b) Write a (completely) reduced Lagrangian $\bar L(\theta,\dot\theta,t)$ and use the reduced generalized
energy theorem to express $\dot\theta$ as a function of $\theta$. Assume the second cylinder to be initially at
rest, and at a very small distance to the right of $\theta = 0$.

(c) Use the full Lagrangian eqn (3.95) to write the three Lagrange equations, introducing
Lagrange multipliers as appropriate. Find the generalized force of constraint $Q_r^{(\text{cons})}$ for the $r$
variable and use it to find the angle $\theta_c$ at which the rolling cylinder will lose contact with the
fixed one.
The power of Lagrangian mechanics has caused generations of students to wonder
why it is necessary, or even desirable, to recast mechanics in Hamiltonian form. The 
answer, which must be taken largely on faith at this point, is that the Hamiltonian 
formulation is a much better base from which to build more advanced methods. The 
Hamilton equations have an elegant symmetry that the Lagrange equations lack. 

Another answer, not directly related to classical mechanics, is that the Hamilto- 
nian function is used to write the Schroedinger equation of quantum mechanics, as 
discussed in Section 4.7. 



4.1 Phase Space 



The differences between the Lagrange and Hamilton equations result mainly from
the different variable sets in which they act. The Lagrangian variable set$^{23}$ is the set
of generalized coordinates and velocities $q, \dot q = q_1, \ldots, q_D, \dot q_1, \ldots, \dot q_D$, whereas the
Hamiltonian set is the set of generalized coordinates and momenta
$q, p = q_1, \ldots, q_D, p_1, \ldots, p_D$.



The $q_k$ in the Hamiltonian set are the same as the, assumedly good, generalized
coordinates used in Lagrangian mechanics. And the $p_k$ are the same as the generalized
momenta that were defined in Section 2.12 as functions of the Lagrangian variables
and the time,

$$p_k = p_k(q,\dot q,t) = \frac{\partial L(q,\dot q,t)}{\partial \dot q_k} \tag{4.1}$$



In Hamiltonian mechanics, the coordinates and momenta in set $q, p$ are lumped
together and considered to be coordinates of a $2D$ dimensional space called phase
space. The variables $q_1, \ldots, q_D, p_1, \ldots, p_D$ are referred to collectively as canonical
coordinates of phase space. The $q_k$ is called the $k$th canonical coordinate, and $p_k$ is called
the $k$th canonical momentum. The pair $q_k, p_k$ for the same $k$ value are called canonical
conjugates. Hamiltonian mechanics is essentially Newton's second law translated
from Lagrangian form into a form appropriate for this phase space.

In order for the phase-space variables $q_1, \ldots, q_D, p_1, \ldots, p_D$ to be an adequate set
of variables for mechanics, eqn (4.1) must be invertible to give inverse functions,

$$\dot q_k = \dot q_k(q,p,t) \tag{4.2}$$

for $k = 1, \ldots, D$, from which the Lagrangian variables $\dot q$ can be found. Then knowledge
of the phase-space variables $q_1, \ldots, q_D, p_1, \ldots, p_D$ will allow one to determine
the Lagrangian variables $q_1, \ldots, q_D, \dot q_1, \ldots, \dot q_D$, from which the position and velocity
of each mass in the system can be found.

23 In the previous chapters, we have made a distinction between the s-system coordinates $s_1, \ldots, s_D$,
which were just re-labeled Cartesian coordinates, and the q-system coordinates $q_1, \ldots, q_D$ which are the
most general good generalized coordinates. We now drop this distinction and use only the general set
$q_1, \ldots, q_D$. Of course, being general, these coordinates include the s-system as a special case.

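By the inverse function theorem, Theorem D.24.1, the condition for such an inversion
is the Jacobian determinant condition

$$\left|\frac{\partial p}{\partial \dot q}\right| \neq 0 \tag{4.3}$$

involving the determinant of a matrix defined by

$$\left(\frac{\partial p}{\partial \dot q}\right)_{kl} = \frac{\partial p_k(q,\dot q,t)}{\partial \dot q_l} = \frac{\partial^2 L(q,\dot q,t)}{\partial \dot q_l\,\partial \dot q_k} \tag{4.4}$$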



The inversion leading to eqn (4.2) is always possible, as proved in the following 
theorem. 

Theorem 4.1.1: Inversion of Momenta 

The matrix $(\partial p/\partial \dot q)$ defined in eqn (4.4) is nonsingular and positive definite. It therefore
satisfies the determinant condition in eqn (4.3), which allows $p_k = p_k(q,\dot q,t)$ to be
solved for $\dot q_k = \dot q_k(q,p,t)$.

Proof: It follows from the expansion of the Lagrangian in Section 2.7 that the $kl$
matrix element in eqn (4.4) is

$$\frac{\partial p_k(q,\dot q,t)}{\partial \dot q_l} = m_{kl}(q,t) = \sum_{j=1}^{D} M_j\,\frac{\partial s_j(q,t)}{\partial q_k}\,\frac{\partial s_j(q,t)}{\partial q_l} \tag{4.5}$$

where the $s_j$ are the Cartesian components of the s-system, and the $M_j$ are the masses
of the point particles.

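Defining a matrix $M$ by its matrix elements $M_{ij} = M_i\,\delta_{ij}$, eqn (4.5) may be written
as

$$\left(\frac{\partial p}{\partial \dot q}\right) = m = \left(\frac{\partial s}{\partial q}\right)^{T} M \left(\frac{\partial s}{\partial q}\right) \tag{4.6}$$

Properties 5 and 10 of Section B.11 then give the determinant of $(\partial p/\partial \dot q)$ as

$$\left|\frac{\partial p}{\partial \dot q}\right| = \left|\left(\frac{\partial s}{\partial q}\right)^{T}\right|\,|M|\,\left|\frac{\partial s}{\partial q}\right| = \left|\frac{\partial s}{\partial q}\right|^{2} |M| = \left|\frac{\partial s}{\partial q}\right|^{2} M_1 M_2 \cdots M_D \tag{4.7}$$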



The nonsingularity of the matrix $(\partial s/\partial q)$ appearing in eqn (4.6) was shown in Section
2.4 to be the condition for the $q$ to be a good system of generalized coordinates,
which we are assuming here. Thus the determinant $|\partial s/\partial q|$ is nonzero. Since all of the
particle masses $M_j$ are positive quantities, it follows that $|\partial p/\partial \dot q| \neq 0$. Thus $(\partial p/\partial \dot q)$
is nonsingular.

We now show that the real, symmetric matrix $(\partial p/\partial \dot q)$ is positive definite. If $[x] \neq [0]$
is an arbitrary, non-null column vector, it follows from eqn (4.5) that

$$[x]^{T}\left(\frac{\partial p}{\partial \dot q}\right)[x] = \sum_{j=1}^{D} M_j\, y_j^2 \tag{4.8}$$

where

$$y_j = \sum_{k=1}^{D} \frac{\partial s_j(q,t)}{\partial q_k}\,x_k \quad\text{or, in matrix form,}\quad [y] = \left(\frac{\partial s}{\partial q}\right)[x] \tag{4.9}$$

Since the matrix $(\partial s/\partial q)$ is nonsingular by assumption, it follows from Corollary
B.19.2 that the column vector $[y]$ must also be non-null.

Since all point masses $M_j$ are positive, nonzero numbers, and since at least one
of the $y_j$ must be nonzero, the right side of eqn (4.8) must be positive and nonzero.
Hence

$$[x]^{T}\, m\, [x] = [x]^{T}\left(\frac{\partial p}{\partial \dot q}\right)[x] > 0 \tag{4.10}$$

Using the definition in Section C.1, this implies that $(\partial p/\partial \dot q)$ is a positive definite
matrix. □



The theorem just proved means that any physical quantity expressed in terms of
Lagrangian variables can equally well be expressed in terms of phase-space ones by
simple substitution. Assuming that $f = f(q,\dot q,t)$ is given, the same function in terms
of phase-space variables is defined as the compound function

$$f = f(q,p,t) = f\big(q, \dot q(q,p,t), t\big) \tag{4.11}$$

where eqn (4.2) has been used.

Since the matrix $m_{kl} = \partial p_k/\partial \dot q_l$ has been proved nonsingular, the theory of linear
equations can be used to solve for the $\dot q_l$ explicitly. The definition of canonical
momenta in eqn (2.69) can be written as the $D$ linear equations, for $k = 1, \ldots, D$,

$$\sum_{l=1}^{D} m_{kl}(q,t)\,\dot q_l = p_k(q,\dot q,t) - n_k(q,t) \tag{4.12}$$

for the $D$ unknowns $\dot q_l$. They can be solved by calculating the inverse $m^{-1}$ of the
nonsingular matrix $m$ and writing

$$\dot q_l(q,p,t) = \sum_{k=1}^{D} m^{-1}_{lk}(q,t)\,\big\{p_k - n_k(q,t)\big\} \tag{4.13}$$

or, equivalently, by using Cramer's rule from Section B.16.
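The chain from eqn (4.5) to eqn (4.13) can be illustrated with a small numerical sketch. It assumes a hypothetical system with $D = 2$: one particle of mass 3 in a plane, with skewed generalized coordinates defined by $s_1 = q_1$ and $s_2 = q_1 + q_2$, and with $n_k = 0$. The function names are illustrative only.

```python
from typing import List

Matrix = List[List[float]]


def mass_matrix(jac: Matrix, masses: List[float]) -> Matrix:
    """m_kl = sum_j M_j (ds_j/dq_k)(ds_j/dq_l), eqn (4.5); jac[j][k] = ds_j/dq_k."""
    D = len(masses)
    return [[sum(masses[j] * jac[j][k] * jac[j][l] for j in range(D))
             for l in range(D)] for k in range(D)]


def det2(a: Matrix) -> float:
    """Determinant of a 2 x 2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def velocities_from_momenta(m: Matrix, n: List[float],
                            p: List[float]) -> List[float]:
    """Solve eqn (4.12) for the velocities when D = 2, using Cramer's rule."""
    d = det2(m)
    if d == 0.0:
        raise ValueError("mass matrix is singular; coordinates are not good")
    r0, r1 = p[0] - n[0], p[1] - n[1]
    return [(r0 * m[1][1] - m[0][1] * r1) / d,
            (m[0][0] * r1 - m[1][0] * r0) / d]


# Assumed example: s_1 = q_1, s_2 = q_1 + q_2, both components with mass 3.
jac = [[1.0, 0.0], [1.0, 1.0]]
masses = [3.0, 3.0]
m = mass_matrix(jac, masses)                # [[6, 3], [3, 3]]
qdot = [1.0, 2.0]
p = [sum(m[k][l] * qdot[l] for l in range(2)) for k in range(2)]  # n_k = 0
recovered = velocities_from_momenta(m, [0.0, 0.0], p)
```

In this example the determinant of `m` equals $|\partial s/\partial q|^2 M_1 M_2 = 9$, in agreement with eqn (4.7), and `recovered` reproduces the original velocities.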
Lagrangian methods are sometimes applied to physical systems that are not obviously
derived from Newton's laws for point masses. In those cases, the proof given
above may not be relevant. But the inversion of eqn (4.1) may still be possible. Systems
in which eqn (4.1) can be inverted to give eqn (4.2) will be referred to as well-defined
Lagrangian systems.



4.2 Hamilton Equations 

The transformation from Lagrange to Hamilton equations is a Legendre transformation,
of the sort defined in Section D.30, which the reader should consult for details.
In this transformation, the Lagrangian function $L(q,\dot q,t)$ of the Lagrangian variable
set $q, \dot q, t$ is to be replaced by the Hamiltonian function $H(q,p,t)$ of the phase-space
variable set $q, p, t$. Thus $L \to H$ corresponds to $f \to g$, and there is an exchange of
variables $\dot q \leftrightarrow p$ corresponding to the exchange of $y$ and $w$. The correspondences
between the present case and the general quantities defined in Section D.30 are: $L \leftrightarrow f$,
$(q,t) \leftrightarrow x$, $\dot q \leftrightarrow y$, $H \leftrightarrow g$, $p \leftrightarrow w$, $\dot p \leftrightarrow u$.

The first step in the Legendre transformation, as in eqn (D.114), is to define the
new function $H$, still expressed in terms of the old variables $q, \dot q, t$. In the present
case, this first step has already been done, in Section 2.15 where the generalized
energy function was defined as a function of the Lagrangian variables,

$$H = H(q,\dot q,t) = \sum_{k=1}^{D} \frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\,\dot q_k - L(q,\dot q,t) = \sum_{k=1}^{D} p_k(q,\dot q,t)\,\dot q_k - L(q,\dot q,t) \tag{4.14}$$



As noted in Section D.30, to complete the Legendre transformation it is necessary
to write eqn (4.14) in terms of the correct variable set $q_1, \ldots, q_D, p_1, \ldots, p_D$. This can
always be done. Theorem 4.1.1 proved that the equations $p_k = p_k(q,\dot q,t)$ can always
be inverted to give $\dot q_k = \dot q_k(q,p,t)$. Thus, one simply substitutes this inverse
equation into eqn (4.14) to write $H(q,p,t)$ as the compound function

$$H = H(q,p,t) = H\big(q, \dot q(q,p,t), t\big) \tag{4.15}$$



Note to the Reader: This step of writing H in terms of phase-space variables is 
essential to the Hamiltonian method. The Hamilton equations will not be true with- 
out it. To emphasize its importance, we reserve the name “Hamiltonian” for the 
expression H(q, p, t) that results after this step is taken. 

Thus, when written in terms of $q, p, t$, the generalized energy function $H(q,\dot q,t)$
becomes the Hamiltonian $H(q,p,t)$. They are the same function, but written in terms
of different variables and called by different names.$^{24}$

24 See Section D.5 for a discussion of the physics convention for labeling the same function expressed in
different variables.

Following the Legendre transformation pattern in Section D.30, the differential of
the $H$ in eqn (4.14) can be written as

$$\begin{aligned}
dH &= \sum_{k=1}^{D}\left(p_k\,d\dot q_k + \dot q_k\,dp_k\right) - dL \\
&= \sum_{k=1}^{D}\left(p_k\,d\dot q_k + \dot q_k\,dp_k - \frac{\partial L(q,\dot q,t)}{\partial q_k}\,dq_k - \frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\,d\dot q_k\right) - \frac{\partial L(q,\dot q,t)}{\partial t}\,dt
\end{aligned} \tag{4.16}$$



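Assuming for now that $Q_k^{(\text{NP})} = 0$, the Lagrange equations, eqn (2.52),

$$\frac{d}{dt}\left(\frac{\partial L(q,\dot q,t)}{\partial \dot q_k}\right) - \frac{\partial L(q,\dot q,t)}{\partial q_k} = 0 \tag{4.17}$$

can be written in a compact form using the definition of $p_k$ from eqn (4.1),

$$\dot p_k = \frac{\partial L(q,\dot q,t)}{\partial q_k} \quad\text{where}\quad p_k = \frac{\partial L(q,\dot q,t)}{\partial \dot q_k} \tag{4.18}$$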



When eqn (4.18) is substituted into eqn (4.16), the $d\dot q_k$ terms cancel and eqn (4.16)
becomes

$$dH = \sum_{k=1}^{D}\left(\dot q_k\,dp_k - \dot p_k\,dq_k\right) + \dot H\,dt \tag{4.19}$$

where the generalized energy theorem $\dot H = -\partial L(q,\dot q,t)/\partial t$ from Section 2.15 has
been used in the last term on the right.

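The differential in eqn (4.19) may now be compared to the differential of the
function $H(q,p,t)$ defined in eqn (4.15), which is

$$dH = \sum_{k=1}^{D}\left(\frac{\partial H(q,p,t)}{\partial p_k}\,dp_k + \frac{\partial H(q,p,t)}{\partial q_k}\,dq_k\right) + \frac{\partial H(q,p,t)}{\partial t}\,dt \tag{4.20}$$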



In the Legendre transformation method, the differentials of the original variables
$dq, d\dot q, dt$ are taken to be independent. Theorem D.18.5 and eqn (4.3) imply that the
set of differentials $dq_1, \ldots, dq_D, dp_1, \ldots, dp_D, dt$ are also independent. Hence, using
Lemma D.18.3, the equality of the left sides of eqns (4.19, 4.20) implies equality of
the corresponding coefficient of each differential term, and hence

$$\dot q_k = \frac{\partial H(q,p,t)}{\partial p_k} \qquad \dot p_k = -\frac{\partial H(q,p,t)}{\partial q_k} \qquad \dot H = \frac{\partial H(q,p,t)}{\partial t} \tag{4.21}$$

for $k = 1, \ldots, D$. The first two of these expressions are called the Hamilton equations.

The Hamilton equations are two sets of coupled first-order differential equations
for the phase-space variables $q_k, p_k$. They are very nearly symmetric in these variables.
Except for the minus sign, the second of eqn (4.21) is just the first one with $q_k$ and $p_k$
exchanged between the partial and the time derivative.

Since the Hamilton equations have been derived from the Lagrange equations
by a Legendre transformation, which is invertible by definition, it follows that the
Hamilton equations hold if and only if the Lagrange equations hold. Thus both are
equivalent to Newton's second law.

Second law $\Longleftrightarrow$ Lagrange equations $\Longleftrightarrow$ Hamilton equations

As can be seen from the properties of Legendre transformations, the first expression
in eqn (4.21) simply restates eqn (4.2). The Lagrangian definition in eqn (4.1)
gives $p_k(q,\dot q,t) = \partial L(q,\dot q,t)/\partial \dot q_k$ and the first Hamilton equations just give the
inverse relation $\dot q_k(q,p,t) = \partial H(q,p,t)/\partial p_k$.

The second Hamilton equations in eqn (4.21), $\dot p_k = -\partial H(q,p,t)/\partial q_k$, are
in a sense the "real" equations of motion, analogous to the Lagrange equations
$\dot p_k = \partial L(q,\dot q,t)/\partial q_k$.

The last of eqn (4.21) equates $\dot H = dH/dt$, the total time rate of change of the
quantity $H$, to the partial derivative $\partial H(q,p,t)/\partial t$ of the function $H(q,p,t)$. It is the
phase-space analog of the Lagrangian generalized energy theorem $\dot H = -\partial L/\partial t$.



4.3 An Example of the Hamilton Equations 

As an example of the transition from Lagrange to Hamilton equations of motion,
consider the system of a single particle in a central potential from Section 2.11. Using
$q_1, q_2, q_3$ equal to polar coordinates $r, \theta, \phi$, the Lagrangian is

$$L = L(q,\dot q,t) = \frac{1}{2}m\left(\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2\right) - \frac{1}{2}kr^2 \tag{4.22}$$



and therefore the generalized momenta pk for k — 1, 2, 3 are given by eqn (4.1) as 

3 L{q,q,t) . dL(q,q,t ) 2 - d L(q,q,t) 2 ■ 2 n 

p r — — = mr pe = — = mr t) p 0 = t = mr sin" 6 



dr dr 

and the generalized energy function calculated from eqn (2.76) is 



H = 



H(q, q, t ) = -m ^/‘ 2 + r 2 0 2 + r 2 sin 2 40 2 ^ + —kr 



Inverting eqn (4.23) to solve for the qk gives eqn (4.2) in the form 

Pr x Pd ; P4> 

r = — 6 — — ~ <t> — — 

m mr z mr- sin 0 



(4.23) 

(4.24) 

(4.25) 



Substituting these into the generalized energy function, eqn (4.24), then gives the
Hamiltonian as a function of the correct phase-space variables,

$$H = H(q, p, t) = \frac{p_r^2}{2m} + \frac{p_\theta^2}{2 m r^2} + \frac{p_\phi^2}{2 m r^2 \sin^2\theta} + \frac{1}{2} k r^2 \tag{4.26}$$

As is always the case, the first set of Hamilton equations from eqn (4.21),
$\dot{q}_k = \partial H(q, p, t)/\partial p_k$ for $k = 1, 2, 3$, simply repeats eqn (4.25),

$$\dot{r} = \frac{\partial H(q, p, t)}{\partial p_r} = \frac{p_r}{m} \qquad \dot{\theta} = \frac{\partial H(q, p, t)}{\partial p_\theta} = \frac{p_\theta}{m r^2} \qquad \dot{\phi} = \frac{\partial H(q, p, t)}{\partial p_\phi} = \frac{p_\phi}{m r^2 \sin^2\theta} \tag{4.27}$$

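The next Hamilton equations, $\dot{p}_k = -\partial H(q, p, t)/\partial q_k$ for $k = 1, 2, 3$, are the real
equations of motion. They are

$$\dot{p}_r = -\frac{\partial H(q, p, t)}{\partial r} = \frac{p_\theta^2}{m r^3} + \frac{p_\phi^2}{m r^3 \sin^2\theta} - k r \tag{4.28}$$

$$\dot{p}_\theta = -\frac{\partial H(q, p, t)}{\partial \theta} = \frac{p_\phi^2 \cos\theta}{m r^2 \sin^3\theta} \qquad \text{and} \qquad \dot{p}_\phi = -\frac{\partial H(q, p, t)}{\partial \phi} = 0 \tag{4.29}$$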



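The last equation in eqn (4.21) is $\dot{H} = \partial H(q, p, t)/\partial t$, which here implies that
$\dot{H} = 0$ and so $H = H(0)$, where the constant $H(0)$ is determined from the value of $H$ at
time zero. This is the Hamiltonian analog of the generalized energy theorem. Thus

$$\frac{p_r^2}{2m} + \frac{p_\theta^2}{2 m r^2} + \frac{p_\phi^2}{2 m r^2 \sin^2\theta} + \frac{1}{2} k r^2 = H(0) \tag{4.30}$$

where

$$H(0) = \frac{p_r^2(0)}{2m} + \frac{p_\theta^2(0)}{2 m r^2(0)} + \frac{p_\phi^2(0)}{2 m r^2(0) \sin^2\theta(0)} + \frac{1}{2} k r^2(0) \tag{4.31}$$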



and the needed values of the canonical momenta at time zero $p_k(0)$ can be determined
from eqn (4.23).

As noted in Section 2.13, in this example the coordinate $\phi$ is ignorable. The last of
eqn (4.29) implies that $p_\phi = \alpha$ where $\alpha = p_\phi(0)$ is some constant determined from the
value of $p_\phi$ at time zero. The constant value can be substituted into eqn (4.30) to give
the generalized energy theorem in an even simpler form, with both the coordinate $\phi$
and its conjugate momentum $p_\phi$ absent,

$$\frac{p_r^2}{2m} + \frac{p_\theta^2}{2 m r^2} + \frac{\alpha^2}{2 m r^2 \sin^2\theta} + \frac{1}{2} k r^2 = H(0) \tag{4.32}$$
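
The conservation statements of this example can be illustrated numerically. The sketch below is not part of the original text: it integrates eqns (4.27) to (4.29) with a classical fourth-order Runge-Kutta step, using assumed values `M = K = 1` and hypothetical function names, and then compares the Hamiltonian of eqn (4.26) and the momentum $p_\phi$ at the start and end of the run.

```python
import math

M, K = 1.0, 1.0  # assumed mass and force constant (illustrative values)


def hamiltonian(s):
    """H(q, p) of eqn (4.26); s = (r, theta, phi, p_r, p_theta, p_phi)."""
    r, th, ph, pr, pth, pph = s
    return (pr**2 / (2 * M) + pth**2 / (2 * M * r**2)
            + pph**2 / (2 * M * r**2 * math.sin(th)**2) + 0.5 * K * r**2)


def derivs(s):
    """Right-hand sides of the Hamilton equations (4.27)-(4.29)."""
    r, th, ph, pr, pth, pph = s
    sin_th = math.sin(th)
    return (
        pr / M,                                   # r-dot
        pth / (M * r**2),                         # theta-dot
        pph / (M * r**2 * sin_th**2),             # phi-dot
        pth**2 / (M * r**3) + pph**2 / (M * r**3 * sin_th**2) - K * r,
        pph**2 * math.cos(th) / (M * r**2 * sin_th**3),
        0.0,                                      # phi is ignorable
    )


def rk4_step(s, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, scale):
        return tuple(x + scale * ki for x, ki in zip(s, k))
    k1 = derivs(s)
    k2 = derivs(shifted(k1, dt / 2))
    k3 = derivs(shifted(k2, dt / 2))
    k4 = derivs(shifted(k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))


def integrate(s0, dt=1e-3, steps=5000):
    """Advance the phase-space point s0 through the given number of steps."""
    s = s0
    for _ in range(steps):
        s = rk4_step(s, dt)
    return s
```

Starting from, say, `(1.0, 1.0, 0.0, 0.2, 0.3, 0.5)`, the value of `hamiltonian` should change only at the level of the integration error, and the last component of the state should remain equal to its initial value, in line with eqns (4.30) and (4.32).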



4.4 Non-Potential and Constraint Forces 

The derivation of the Hamilton equations in Section 4.2 has assumed that all forces
are derived from the potential $U(q, t)$. However, if non-potential forces are present,
possibly including suitable constraint forces that do no virtual work, the Hamilton
equations can be generalized easily. In the step leading to eqn (4.19) above, one simply
replaces the Lagrange equation $\dot{p}_k = \partial L(q, \dot{q}, t)/\partial q_k$ and the generalized energy
theorem $\dot{H} = -\partial L(q, \dot{q}, t)/\partial t$ by the more general expressions from eqns (2.52, 2.78),

$$\dot{p}_k = \frac{\partial L(q, \dot{q}, t)}{\partial q_k} + Q_k^{(NP)} \qquad \text{and} \qquad \dot{H} = \sum_{k=1}^{D} Q_k^{(NP)} \dot{q}_k - \frac{\partial L(q, \dot{q}, t)}{\partial t} \tag{4.33}$$

leading to the Hamilton equations

$$\dot{q}_k = \frac{\partial H(q, p, t)}{\partial p_k} \qquad \dot{p}_k = -\frac{\partial H(q, p, t)}{\partial q_k} + Q_k^{(NP)} \tag{4.34}$$

$$\dot{H} = \frac{\partial H(q, p, t)}{\partial t} + \sum_{k=1}^{D} Q_k^{(NP)} \dot{q}_k(q, p, t) \tag{4.35}$$



These are the general Hamilton equations in the presence of non-potential forces. 
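
As a rough numerical illustration of the energy equation in the presence of a non-potential force, the following sketch (not from the text) assumes a one-dimensional oscillator with a linear drag force $Q^{(NP)} = -B\dot{q}$ and hypothetical constants `M`, `K`, `B`. It integrates the Hamilton equations together with the accumulated work $\int Q^{(NP)} \dot{q}\, dt$ and compares that work with the change in $H$; since $H$ here has no explicit time dependence, the two should agree.

```python
M, K, B = 1.0, 4.0, 0.3  # assumed mass, spring constant, drag coefficient


def hamiltonian(q, p):
    """H(q, p) = p^2 / 2M + K q^2 / 2 (no explicit time dependence)."""
    return p**2 / (2 * M) + 0.5 * K * q**2


def q_np(q, p):
    """Assumed non-potential force Q = -B * qdot, with qdot = p / M."""
    return -B * p / M


def derivs(state):
    """Hamilton equations with Q added, plus the work rate Q * qdot."""
    q, p, work = state
    qdot = p / M
    force = q_np(q, p)
    return (qdot, -K * q + force, force * qdot)


def rk4_step(state, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, scale):
        return tuple(x + scale * ki for x, ki in zip(state, k))
    k1 = derivs(state)
    k2 = derivs(shifted(k1, dt / 2))
    k3 = derivs(shifted(k2, dt / 2))
    k4 = derivs(shifted(k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, dt=1e-3, steps=3000):
    """Advance (q, p, accumulated work) through the given number of steps."""
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state
```

Calling `integrate((1.0, 0.0, 0.0))` returns the final `(q, p, work)`; the difference `hamiltonian(q, p) - hamiltonian(1.0, 0.0)` should match `work` to within the integration error, and `work` is negative because the drag force removes energy.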

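When the non-potential forces all come from suitable constraints, then, as proved
in Theorem 3.4.1, $Q_k^{(NP)} = Q_k^{(cons)} = \sum_{a=1}^{C} \lambda_a \, \partial G_a(q, t)/\partial q_k$ and hence the Hamilton
equations become

$$\dot{q}_k = \frac{\partial H(q, p, t)}{\partial p_k} \qquad \dot{p}_k = -\frac{\partial H(q, p, t)}{\partial q_k} + \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(q, t)}{\partial q_k} \tag{4.36}$$

$$\dot{H} = \frac{\partial H(q, p, t)}{\partial t} - \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(q, t)}{\partial t} \tag{4.37}$$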



This last equation follows from the same argument as was used in the proof of Theo- 
rem 3.13.1. 



4.5 Reduced Hamiltonian 

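When the forces of constraint in a Lagrangian problem do no virtual work, and the
$C$ constraints are holonomic and independent, Section 3.8 showed how to use the
constraints to reduce the number of degrees of freedom of the problem from $D$ to
$D - C$. The reduced Lagrangian $\bar{L}(q^{(f)}, \dot{q}^{(f)}, t)$ defined there can be used to define a
reduced generalized energy function, as was done in eqn (3.82) of Theorem 3.13.1,

$$\bar{H} = \bar{H}(q^{(f)}, \dot{q}^{(f)}, t) = \sum_{k=1}^{D-C} \frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \dot{q}_k^{(f)}} \dot{q}_k^{(f)} - \bar{L}(q^{(f)}, \dot{q}^{(f)}, t) = \sum_{k=1}^{D-C} \bar{p}_k^{(f)}(q^{(f)}, \dot{q}^{(f)}, t) \, \dot{q}_k^{(f)} - \bar{L}(q^{(f)}, \dot{q}^{(f)}, t) \tag{4.38}$$

where we have defined, for $k = 1, \ldots, D - C$,

$$\bar{p}_k^{(f)} = \bar{p}_k^{(f)}(q^{(f)}, \dot{q}^{(f)}, t) = \frac{\partial \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \dot{q}_k^{(f)}} \tag{4.39}$$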



Since all of the reduced Lagrange equations, eqn (3.40), have zeroes on their right
hand side, the same Legendre transformation procedure used in Section 4.2 above
can be used to define a reduced Hamiltonian $\bar{H}(q^{(f)}, \bar{p}^{(f)}, t)$ and reduced Hamilton
equations

$$\dot{q}_k^{(f)} = \frac{\partial \bar{H}(q^{(f)}, \bar{p}^{(f)}, t)}{\partial \bar{p}_k^{(f)}} \qquad \dot{\bar{p}}_k^{(f)} = -\frac{\partial \bar{H}(q^{(f)}, \bar{p}^{(f)}, t)}{\partial q_k^{(f)}} \qquad \frac{d\bar{H}}{dt} = \frac{\partial \bar{H}(q^{(f)}, \bar{p}^{(f)}, t)}{\partial t} \tag{4.40}$$

for $k = 1, \ldots, (D - C)$.

The constrained variables have been eliminated from the problem. The whole 
Hamiltonian procedure is just as if the original Lagrangian problem had been free of 
constraints from the start. 

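However, the above derivation leading to eqn (4.40) will be correct only if the
definition $\bar{p}_k^{(f)} = \bar{p}_k^{(f)}(q^{(f)}, \dot{q}^{(f)}, t)$ from eqn (4.39) can actually be inverted to give

$$\dot{q}_k^{(f)} = \dot{q}_k^{(f)}(q^{(f)}, \bar{p}^{(f)}, t) \tag{4.41}$$

This inversion allows one to make the usual substitution

$$\bar{H} = \bar{H}(q^{(f)}, \bar{p}^{(f)}, t) = \bar{H}\left(q^{(f)}, \dot{q}^{(f)}(q^{(f)}, \bar{p}^{(f)}, t), t\right) \tag{4.42}$$

to convert the reduced generalized energy function $\bar{H}(q^{(f)}, \dot{q}^{(f)}, t)$ to the reduced
Hamiltonian $\bar{H}(q^{(f)}, \bar{p}^{(f)}, t)$.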

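Again using Theorem D.24.1, the condition for the inversion of eqn (4.39) to give
eqn (4.41) is

$$\left| \frac{\partial \bar{p}^{(f)}}{\partial \dot{q}^{(f)}} \right| \neq 0 \tag{4.43}$$

where the matrix $(\partial \bar{p}^{(f)}/\partial \dot{q}^{(f)})$ is defined by

$$\left( \frac{\partial \bar{p}^{(f)}}{\partial \dot{q}^{(f)}} \right)_{kl} = \frac{\partial \bar{p}_k^{(f)}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \dot{q}_l^{(f)}} = \frac{\partial^2 \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \dot{q}_l^{(f)} \, \partial \dot{q}_k^{(f)}} \tag{4.44}$$

The following theorem proves that this inversion can always be done.

Theorem 4.5.1: Inversion of Reduced Momenta

The matrix $(\partial \bar{p}^{(f)}/\partial \dot{q}^{(f)})$ defined in eqn (4.44) is positive definite and hence nonsingular.
Thus the inversion condition eqn (4.43) is always satisfied.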

Proof: When the constraints are holonomic and functionally independent, the bound
variables can be written as functions of the free ones as in eqn (3.36),
$q_l^{(b)} = q_l^{(b)}(q^{(f)}, t)$. Substituting this result and its derivatives into the expansion of
the full Lagrangian $L$ in Section 2.7 gives the reduced Lagrangian $\bar{L}$ in the form

$$\bar{L} = \frac{1}{2} \sum_{k=1}^{D-C} \sum_{l=1}^{D-C} \bar{m}_{kl} \, \dot{q}_k^{(f)} \dot{q}_l^{(f)} + \sum_{k=1}^{D-C} \bar{n}_k \, \dot{q}_k^{(f)} + \bar{T}_0 - \bar{U} \tag{4.45}$$

where, for $k, l = 1, \ldots, (D - C)$,

$$\bar{m}_{kl} = m_{kl} + \sum_{i=D-C+1}^{D} \sum_{j=D-C+1}^{D} s_{ik} \, m_{ij} \, s_{jl} + \sum_{j=D-C+1}^{D} m_{kj} \, s_{jl} + \sum_{i=D-C+1}^{D} s_{ik} \, m_{il} \tag{4.46}$$

where $m$ is the matrix proved positive definite in Theorem 4.1.1 and, with
$k = (D - C + 1), \ldots, D$ and $i = 1, \ldots, (D - C)$,

$$s_{ki} = \frac{\partial q_k^{(b)}(q^{(f)}, t)}{\partial q_i^{(f)}} \tag{4.47}$$

Putting eqn (4.45) into eqn (4.44) gives

$$\left( \frac{\partial \bar{p}^{(f)}}{\partial \dot{q}^{(f)}} \right)_{kl} = \frac{\partial^2 \bar{L}(q^{(f)}, \dot{q}^{(f)}, t)}{\partial \dot{q}_l^{(f)} \, \partial \dot{q}_k^{(f)}} = \bar{m}_{kl} \tag{4.48}$$

To prove $\bar{m} = (\partial \bar{p}^{(f)}/\partial \dot{q}^{(f)})$ positive definite, let $[x]$ be any arbitrary, real, non-null
column vector of dimension $(D - C)$. Then define another column vector $[y]$ of
dimension $D$ as the compound matrix

$$[y] = \begin{pmatrix} [x] \\ s \, [x] \end{pmatrix} \tag{4.49}$$

It follows from the positive-definiteness of $m$ that

$$[x]^T \, \bar{m} \, [x] = [y]^T \, m \, [y] > 0 \tag{4.50}$$

which proves that $\bar{m}$ is also positive definite. It follows from Lemma C.1.1 that matrix
$\bar{m} = (\partial \bar{p}^{(f)}/\partial \dot{q}^{(f)})$ is nonsingular. Hence the inversion condition eqn (4.43) is
satisfied. □



4.6 Poisson Brackets 

In Hamiltonian mechanics, all physical quantities are represented by phase-space
functions like that in eqn (4.11). Assuming now that no constraints are present, the
Hamilton equations, eqn (4.21), and the chain rule can be used to write the total time
derivative of such a function $f$ in a useful form

$$\dot{f} = \frac{df}{dt} = \sum_{k=1}^{D} \left( \frac{\partial f(q, p, t)}{\partial q_k} \dot{q}_k + \frac{\partial f(q, p, t)}{\partial p_k} \dot{p}_k \right) + \frac{\partial f(q, p, t)}{\partial t} = \sum_{k=1}^{D} \left( \frac{\partial f(q, p, t)}{\partial q_k} \frac{\partial H(q, p, t)}{\partial p_k} - \frac{\partial f(q, p, t)}{\partial p_k} \frac{\partial H(q, p, t)}{\partial q_k} \right) + \frac{\partial f(q, p, t)}{\partial t} \tag{4.51}$$

The sum in eqn (4.51) appears frequently enough to merit a special notation for it.
It is called the Poisson bracket $[f, H]$ of the two phase-space functions $f(q, p, t)$ and
$H(q, p, t)$, so that eqn (4.51) becomes

$$\dot{f} = \frac{df}{dt} = [f, H] + \frac{\partial f(q, p, t)}{\partial t} \tag{4.52}$$



If $df/dt = 0$, then phase-space function $f$ is called a constant of the motion.
Equation (4.52) thus implies that a phase-space function that is not an explicit function of
$t$ will be a constant of the motion if and only if it has a vanishing Poisson bracket with
the Hamiltonian.

The Poisson bracket $[f, g]$ can be defined more generally, for any two phase-space
functions $f = f(q, p, t)$ and $g = g(q, p, t)$,

$$[f, g] = \sum_{k=1}^{D} \left( \frac{\partial f(q, p, t)}{\partial q_k} \frac{\partial g(q, p, t)}{\partial p_k} - \frac{\partial g(q, p, t)}{\partial q_k} \frac{\partial f(q, p, t)}{\partial p_k} \right) \tag{4.53}$$

Note that, since partial derivatives are functions of the same variable set as was the
function differentiated, the Poisson bracket $[f, g]$ is itself another phase-space function.

This definition implies some useful algebraic properties. First, by construction, the
Poisson bracket is anti-symmetric in the exchange of the two functions so that, for
any $f$ and $g$,

$$[g, f] = -[f, g] \qquad \text{and hence} \qquad [f, f] = 0 \tag{4.54}$$

Also, when $f(q, p, t)$, $g(q, p, t)$, and $h = h(q, p, t)$ are any phase-space functions, and
$\alpha$, $\beta$ are numbers or otherwise not functions of $q, p$, the following identities can be
proved,

$$[f, (\alpha g + \beta h)] = \alpha [f, g] + \beta [f, h] \tag{4.55}$$

$$[f, gh] = g[f, h] + [f, g]h \tag{4.56}$$

$$[f, [g, h]] + [h, [f, g]] + [g, [h, f]] = 0 \tag{4.57}$$

where, for example, $[f, [g, h]]$ denotes the Poisson bracket of function $f$ with the
function $[g, h]$ which was obtained by taking the Poisson bracket of $g$ and $h$. The last
of the three identities is called the Jacobi identity.

The algebra of Poisson brackets closely resembles that of the commutators of op- 
erators discussed in Section 7.1. This similarity is exploited in quantum mechanics. 
One path from classical to quantum mechanics is to write Poisson bracket relations 
and then replace the phase-space functions by quantum operators, as is discussed in 
Section 12.13. 

Poisson brackets can be used to write the Hamilton equations in Poisson bracket
form. Replacing $f(q, p, t)$ in eqn (4.51) by the single variables $q_k$, $p_k$, $H$ in succession
allows eqn (4.21) to be written in the form, for any $k = 1, \ldots, D$,

$$\dot{q}_k = [q_k, H] \qquad \dot{p}_k = [p_k, H] \qquad \dot{H} = [H, H] + \frac{\partial H(q, p, t)}{\partial t} = \frac{\partial H(q, p, t)}{\partial t} \tag{4.58}$$

The following identities follow directly from the definition in eqn (4.53). If one
puts $f(q, p, t)$ and $g(q, p, t)$ equal to any single canonical coordinate or momentum,
then, for any choices $k, l = 1, \ldots, D$, it follows that

$$[q_k, q_l] = 0 \qquad [q_k, p_l] = \delta_{kl} \qquad [p_k, p_l] = 0 \tag{4.59}$$

where $\delta_{kl}$ is the Kronecker delta function. These are called the fundamental Poisson
brackets, and are analogous to similar operator equations in quantum mechanics.
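
The definition of the Poisson bracket lends itself to a direct numerical check. The sketch below is illustrative only and not part of the text: it evaluates the bracket at a phase-space point by central differences, with phase-space functions represented as hypothetical Python callables `f(q, p)` taking lists of coordinates and momenta (explicit time dependence is omitted for brevity).

```python
import math


def partial(func, q, p, k, wrt, h=1e-5):
    """Central-difference partial derivative of func(q, p) with respect
    to q[k] (wrt == "q") or p[k] (wrt == "p")."""
    q_plus, q_minus = list(q), list(q)
    p_plus, p_minus = list(p), list(p)
    if wrt == "q":
        q_plus[k] += h
        q_minus[k] -= h
    else:
        p_plus[k] += h
        p_minus[k] -= h
    return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2 * h)


def poisson_bracket(f, g, q, p):
    """[f, g] = sum_k (df/dq_k dg/dp_k - dg/dq_k df/dp_k) at the point (q, p)."""
    total = 0.0
    for k in range(len(q)):
        total += (partial(f, q, p, k, "q") * partial(g, q, p, k, "p")
                  - partial(g, q, p, k, "q") * partial(f, q, p, k, "p"))
    return total


def coordinate(k):
    """Phase-space function returning the single coordinate q_k."""
    return lambda q, p: q[k]


def momentum(k):
    """Phase-space function returning the single momentum p_k."""
    return lambda q, p: p[k]


# Two arbitrary sample functions, chosen only for illustration.
def f_sample(q, p):
    return q[0]**2 * p[1] + math.sin(q[1])


def g_sample(q, p):
    return p[0] * q[1] + p[2]**2
```

For example, `poisson_bracket(coordinate(0), momentum(0), q, p)` should be close to 1 at any point, `poisson_bracket(coordinate(0), momentum(1), q, p)` close to 0, and exchanging the two arguments of `poisson_bracket` should flip the sign of the result.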

Poisson brackets also play a crucial role in the definition of what are called Canon- 
ical Transformations of phase-space variables. But we will defer that discussion until 
the extended Lagrangian and Hamiltonian methods, with time as a coordinate, are 
introduced in Part II of the book. 



4.7 The Schroedinger Equation 

The Hamiltonian is an essential element in the derivation of the Schroedinger
equation of quantum mechanics. We illustrate this transition from classical to quantum
mechanics by using the example of a single particle of mass $m$ moving in a potential
$U(x, y, z, t)$. For such a system, the Lagrangian is

$$L = \frac{m}{2} \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right) - U(x, y, z, t) \tag{4.60}$$

from which we derive the Hamiltonian

$$H = H(q, p, t) = \frac{p_x^2 + p_y^2 + p_z^2}{2m} + U(x, y, z, t) \tag{4.61}$$

The generalized coordinates here are just the Cartesian coordinates of the particle,
$q_1 = x$, $q_2 = y$, $q_3 = z$.

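The standard recipe for the transition to quantum mechanics is to make the
substitutions

$$H \to i\hbar \frac{\partial}{\partial t} \qquad p_x \to -i\hbar \frac{\partial}{\partial x} \qquad p_y \to -i\hbar \frac{\partial}{\partial y} \qquad p_z \to -i\hbar \frac{\partial}{\partial z} \tag{4.62}$$

in eqn (4.61), and then introduce a Schroedinger wave function $\psi(x, y, z, t)$ for the
differential operators to operate on, leading to

$$i\hbar \frac{\partial}{\partial t} \psi = \frac{1}{2m} \left( -i\hbar \frac{\partial}{\partial x} \right) \left( -i\hbar \frac{\partial}{\partial x} \right) \psi + \frac{1}{2m} \left( -i\hbar \frac{\partial}{\partial y} \right) \left( -i\hbar \frac{\partial}{\partial y} \right) \psi + \frac{1}{2m} \left( -i\hbar \frac{\partial}{\partial z} \right) \left( -i\hbar \frac{\partial}{\partial z} \right) \psi + U\psi \tag{4.63}$$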



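The products of operators are interpreted as repeated application, leading to second
partial derivatives. For example,

$$\left( -i\hbar \frac{\partial}{\partial x} \right) \left( -i\hbar \frac{\partial}{\partial x} \right) \psi = -\hbar^2 \frac{\partial^2 \psi}{\partial x^2} \tag{4.64}$$

The result is the Schroedinger equation, the fundamental equation of nonrelativistic
quantum mechanics. It is usually written as

$$i\hbar \frac{\partial}{\partial t} \psi = \frac{-\hbar^2}{2m} \nabla^2 \psi + U\psi \tag{4.65}$$

where the Laplacian operator $\nabla^2$ is defined as

$$\nabla^2 \psi = \frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2} = \nabla \cdot (\nabla \psi) \tag{4.66}$$

where $\nabla$ is the gradient operator defined in eqn (A.66).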
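
The rule that a product of operators means repeated application can be tried out on a simple wave function. The sketch below is not from the text: it assumes units with $\hbar = 1$, represents the operator $-i\hbar\,\partial/\partial x$ by a central difference with a hypothetical step `h`, and applies it twice to a one-dimensional plane wave $e^{ipx/\hbar}$, for which the result should be close to $p^2$ times the wave function.

```python
import cmath

HBAR = 1.0  # assumed units in which hbar = 1


def momentum_operator(psi, h=1e-4):
    """Return the function x -> (-i hbar d/dx psi)(x), with the derivative
    approximated by a central difference of step h."""
    def applied(x):
        return -1j * HBAR * (psi(x + h) - psi(x - h)) / (2 * h)
    return applied


def plane_wave(p):
    """One-dimensional plane wave exp(i p x / hbar) with momentum p."""
    return lambda x: cmath.exp(1j * p * x / HBAR)


def apply_twice(psi):
    """Product of two momentum operators, interpreted as repeated application."""
    return momentum_operator(momentum_operator(psi))
```

With `psi = plane_wave(2.0)`, the value `apply_twice(psi)(x)` should be approximately `4.0 * psi(x)` at any `x`, which is the plane-wave instance of the statement that the operator product yields $-\hbar^2\,\partial^2\psi/\partial x^2$.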



4.8 The Ehrenfest Theorem 

The square of the absolute value of the Schroedinger wave function serves as a
probability density in quantum theory. In one dimensional problems, for example, in the
limit $dx \to 0$ the quantity $\mathcal{P}(x, t)\,dx$ is the probability that the particle will be found
between $x$ and $x + dx$, where $\mathcal{P}(x, t) = \psi^*(x, t)\,\psi(x, t)$. Instead of predicting the actual
values of classical variables like position and momentum, quantum theory predicts a
most likely value called the expectation value. The recipe for finding the expectation
value is: (1) First one forms the classical phase-space function $f(x, y, z, p_x, p_y, p_z)$
representing the physical variable. (2) One then replaces the classical $q, p$ values by
quantum mechanical operators. In the position basis we are using here as an example,
the operators representing positions $x, y, z$ are just the coordinates themselves,
but for the momenta the substitution in eqn (4.62) must be used. (3) The expectation
value is then

$$\langle f \rangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \psi^* \, f\!\left(x, y, z, -i\hbar \frac{\partial}{\partial x}, -i\hbar \frac{\partial}{\partial y}, -i\hbar \frac{\partial}{\partial z}\right) \psi \; dx\,dy\,dz \tag{4.67}$$

where we assume throughout that the Schroedinger wave function is normalized to
give a probability of one that the particle will be found at some position,

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \psi^* \psi \; dx\,dy\,dz = 1 \tag{4.68}$$

For example, the expectation of the $z$-component of angular momentum, $L_z = x p_y - y p_x$, is

$$\langle L_z \rangle = -i\hbar \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \psi^* \left( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \right) \psi \; dx\,dy\,dz \tag{4.69}$$

Quantum mechanics also predicts an RMS deviation from the expectation value. It is
defined as

$$\Delta f = \sqrt{ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \psi^* \left( f\!\left(x, y, z, -i\hbar \frac{\partial}{\partial x}, -i\hbar \frac{\partial}{\partial y}, -i\hbar \frac{\partial}{\partial z}\right) - \langle f \rangle \right)^{2} \psi \; dx\,dy\,dz } \tag{4.70}$$



Quantum mechanics is what is called a cover theory for classical mechanics. This
means that quantum mechanics is the more comprehensive theory and should predict
all of the classical results obtained in this book, in the limited domain, called the
classical limit, in which classical mechanics is adequate. Roughly speaking, this classical
limit is reached when one may (to some acceptable degree of approximation) ignore
$\Delta f$ and treat the expectation value $\langle f \rangle$ as if it were the actual value of a classical
phase-space function $f(q, p)$. However, it is difficult to give a general prescription for
this limit, and each case must be approached individually.

The Ehrenfest theorem shows that, with some limitations, the Hamilton equations 
of classical mechanics also hold in quantum mechanics. 

Theorem 4.8.1: Ehrenfest Theorem 

With a Hamiltonian of the general form given in eqn (4.61), the expectation values of
position and momenta obey equations which resemble the classical Hamilton equations,

$$\frac{d}{dt} \langle x_i \rangle = \left\langle \frac{\partial H(x, y, z, p_x, p_y, p_z)}{\partial p_i} \right\rangle \qquad \frac{d}{dt} \langle p_i \rangle = -\left\langle \frac{\partial H(x, y, z, p_x, p_y, p_z)}{\partial x_i} \right\rangle \tag{4.71}$$

where $i = 1, 2, 3$ and $x_1 = x$, $x_2 = y$, $p_1 = p_x$, etc. The expressions on the right, like
$\langle \partial H(x, y, z, p_x, p_y, p_z)/\partial x \rangle$ for example, are obtained by first taking the partial
derivative of the classical Hamiltonian, then making the substitutions from eqn (4.62), and
finally placing the resulting expression into eqn (4.67) to obtain its expectation value.

Proof: In quantum texts, for example Chapter 6 of Shankar (1994), eqn (4.71) is
proved to follow from the Schroedinger equation, eqn (4.65). This proof is general
and is not restricted to the classical limit. □

The Ehrenfest theorem does not allow us simply to replace the classical variables
by their expectation values. For example, in general

$$\left\langle \frac{\partial H(x, y, z, p_x, p_y, p_z)}{\partial x} \right\rangle \neq \frac{\partial H(\langle x \rangle, \langle y \rangle, \langle z \rangle, \langle p_x \rangle, \langle p_y \rangle, \langle p_z \rangle)}{\partial \langle x \rangle} \tag{4.72}$$

and the classical limit still requires careful consideration.
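
A short worked case may make the inequality concrete. The following is an illustration under an assumed Hamiltonian that does not appear in the text: a one-dimensional particle in a cubic potential with a hypothetical coupling constant $\lambda$. It uses only the normalization of the wave function and the fact that, for $f = x$, the RMS deviation of eqn (4.70) satisfies $(\Delta x)^2 = \langle x^2 \rangle - \langle x \rangle^2$.

```latex
% Assumed illustrative Hamiltonian (not from the text)
\begin{align*}
H &= \frac{p_x^2}{2m} + \frac{\lambda}{3}\, x^3
  \quad\Longrightarrow\quad
  \frac{\partial H}{\partial x} = \lambda x^2 \\[4pt]
% Left side of the inequality: expectation value of the derivative
\left\langle \frac{\partial H}{\partial x} \right\rangle
  &= \lambda \langle x^2 \rangle
   = \lambda \left( \langle x \rangle^2 + (\Delta x)^2 \right) \\[4pt]
% Right side: derivative evaluated at the expectation values
\frac{\partial H(\langle x \rangle, \langle p_x \rangle)}{\partial \langle x \rangle}
  &= \lambda \langle x \rangle^2
\end{align*}
```

In this example the two sides differ by $\lambda (\Delta x)^2$, so they agree only to the extent that the RMS deviation of the position can be neglected.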

4.9 Exercises 

Exercise 4.1 This exercise is to emphasize the importance of writing the Hamiltonian in
terms of the correct variable set $q, p, t$ before the Hamilton equations are applied. It shows
that partial derivatives depend not only on the variable differentiated with respect to ($t$ here)
but also on the list of variables to be held constant as the derivative is taken. Define

$$u(t) = a + \frac{1}{2} b t^2 \qquad \text{and} \qquad v(t) = \frac{1}{2} b t^2 + c t^5 \tag{4.73}$$

Also define

$$f(u, t) = u + \sin\omega t + c t^5 \qquad \text{and} \qquad f(v, t) = a + v + \sin\omega t \tag{4.74}$$

(a) By writing each out as a function of $t$, prove that the two compound functions are equal,
$f(u, t) = f(v, t)$ for every value of $t$. (Note that we are following the physicist's convention
of using the same letter $f$ for both compound functions, as discussed in Section D.5.)

(b) Calculate $\partial f(u, t)/\partial t$ and $\partial f(v, t)/\partial t$ and show, by considering each as a function of $t$,
that they are not equal.

Exercise 4.2 Suppose that a one-dimensional system has Lagrangian

$$L(q, \dot{q}, t) = \frac{1}{2} m \dot{q}_1^2 \sin^2\omega t + m\omega q_1 \dot{q}_1 \sin\omega t \cos\omega t + \frac{1}{2} m \omega^2 q_1^2 \cos^2\omega t - m g q_1 \sin\omega t \tag{4.75}$$

(a) Find an expression for the canonical momentum $p_1$ and solve it for $\dot{q}_1$.

(b) Find the generalized energy function $H(q, \dot{q}, t)$, and use the result of part (a) to write it
in terms of the correct Hamiltonian variables to give the Hamiltonian $H(q, p, t)$.

(c) Write the two Hamilton equations. Verify that the one for $\dot{q}_1$ is consistent with your result
from part (a).

(d) Use the Hamilton equation $\dot{H} = \partial H(q, p, t)/\partial t$ to test whether or not $H$ is conserved.

Exercise 4.3 In part (e) of Exercise 3.1, you derived a reduced Lagrangian $\bar{L}(q^{(f)}, \dot{q}^{(f)}, t)$
for the plane double pendulum, using the free coordinates $q^{(f)} = \theta_1, \theta_2$.

(a) Find the generalized momenta conjugate to these free coordinates and invert them to solve
for $\dot{\theta}_1$ and $\dot{\theta}_2$ as functions of the momenta.

(b) Write the reduced Hamiltonian $\bar{H}(q^{(f)}, \bar{p}^{(f)}, t)$ for this problem.

Exercise 4.4 A system with two degrees of freedom has a Lagrangian

$$L(q, \dot{q}, t) = a \dot{q}_1^2 + 2b \dot{q}_1 \dot{q}_2 + c \dot{q}_2^2 + f \tag{4.76}$$

where $a, b, c, f$ are given functions of $q_1, q_2, t$.

(a) Find the two generalized momenta $p_k(q, \dot{q}, t)$ and, using Cramer's rule or otherwise, write
expressions for the $\dot{q}_k$ as functions of $q, p, t$.

(b) Write the generalized energy function $H(q, \dot{q}, t)$ and express it in terms of the proper
phase-space coordinates $q, p, t$ to form the Hamiltonian $H(q, p, t)$.

(c) Verify that the Hamilton equations for $\dot{q}_k$ simply restate your result from part (a).

Exercise 4.5 Suppose that we have a Hamiltonian $H(q, p, t)$ and the usual Hamilton
equations

$$\dot{q}_k = \frac{\partial H(q, p, t)}{\partial p_k} \qquad \dot{p}_k = -\frac{\partial H(q, p, t)}{\partial q_k} \qquad \dot{H} = \frac{\partial H(q, p, t)}{\partial t} \tag{4.77}$$

We want to make a Legendre transformation from $H(q, p, t)$ back to $L(q, \dot{q}, t)$. Note that
the variables being exchanged are $p \leftrightarrow \dot{q}$, and that this is the inverse of the Legendre
transformation we used to get $H$ in the first place.

(a) Write an expression for $L(q, p, t)$ in terms of $H(q, p, t)$ using the rules of the Legendre
transformation as outlined in Section D.30.

(b) Assume that the first of eqn (4.77) can be inverted to give $p_k = p_k(q, \dot{q}, t)$ and show how
this can be used to write $L$ in terms of the correct Lagrangian variables $L = L(q, \dot{q}, t)$.

(c) Write the differential $dL$ and use it to derive the three Lagrange equations

$$p_k = \frac{\partial L(q, \dot{q}, t)}{\partial \dot{q}_k} \qquad \dot{p}_k = \frac{\partial L(q, \dot{q}, t)}{\partial q_k} \qquad \dot{H} = -\frac{\partial L(q, \dot{q}, t)}{\partial t} \tag{4.78}$$

Exercise 4.6 We know from Exercise 2.3 that any solution of the Lagrange equations with
Lagrangian $L(q, \dot{q}, t)$ is also a solution of the Lagrange equations with an equivalent
Lagrangian $L'(q, \dot{q}, t)$ where

$$L'(q, \dot{q}, t) = L(q, \dot{q}, t) + \frac{df(q, t)}{dt} = L(q, \dot{q}, t) + \sum_{k=1}^{D} \frac{\partial f(q, t)}{\partial q_k} \dot{q}_k + \frac{\partial f(q, t)}{\partial t} \tag{4.79}$$

(a) Let the generalized momenta for these two Lagrangians be denoted

$$p_k = \frac{\partial L(q, \dot{q}, t)}{\partial \dot{q}_k} \qquad \text{and} \qquad p'_k = \frac{\partial L'(q, \dot{q}, t)}{\partial \dot{q}_k} \tag{4.80}$$

Write an equation for $p'_k$ as a function of $p_k$ and the partial derivatives of $f$.

(b) Find the generalized energy function $H'(q, \dot{q}, t)$ corresponding to the Lagrangian $L'$.
Write it in terms of $H(q, \dot{q}, t)$, the generalized energy function corresponding to Lagrangian
$L$, and partial derivatives of $f$ as needed.

(c) Now assume that the original generalized energy function $H(q, \dot{q}, t)$ can be converted to
a Hamiltonian $H(q, p, t)$ in the usual way. Use that fact to write $H'$ as a function of variables
$q, p, t$.

(d) Solve your expression for $p'_k$ in part (a) for $p_k = p_k(q, p', t)$, and use that solution to
write $H'$ in terms of its correct Hamiltonian variables $q, p', t$.

$$H' = H'(q, p', t) = H'(q, p \to p(q, p', t), t) \tag{4.81}$$

(e) Assume that the Hamilton equations for the original Hamiltonian $H$ hold, and prove that
the Hamilton equations for $H'$ are also true

$$\dot{q}_k = \frac{\partial H'(q, p', t)}{\partial p'_k} \qquad \text{and} \qquad \dot{p}'_k = -\frac{\partial H'(q, p', t)}{\partial q_k} \tag{4.82}$$

Exercise 4.7 Charged particles in an electromagnetic field were treated in Section 2.17.

(a) Show that the Hamiltonian derived from the generalized energy in eqn (2.105) is

$$H = \sum_{n=1}^{N} \frac{\left( \underline{\mathbf{p}}_n - q_n^{(ch)} \mathbf{A}(\mathbf{r}_n, t)/c \right) \cdot \left( \underline{\mathbf{p}}_n - q_n^{(ch)} \mathbf{A}(\mathbf{r}_n, t)/c \right)}{2 m_n} + \sum_{n=1}^{N} q_n^{(ch)} \Phi(\mathbf{r}_n, t) \tag{4.83}$$

where $\underline{\mathbf{p}}_n$ is the canonical momentum defined in eqn (2.104).

(b) Show that the Hamilton equations may be written in vector form as

$$\dot{\mathbf{r}}_n = \frac{\partial H}{\partial \underline{\mathbf{p}}_n} \qquad \dot{\underline{\mathbf{p}}}_n = -\frac{\partial H}{\partial \mathbf{r}_n} \tag{4.84}$$

(c) Show that the first Hamilton equation simply restates eqn (2.104).

(d) Use the quantum substitution analogous to eqn (4.62) but with the quantum operators$^{25}$
replacing components of the canonical momentum $\underline{\mathbf{p}}$,

$$H \to i\hbar \frac{\partial}{\partial t} \qquad \underline{p}_x \to -i\hbar \frac{\partial}{\partial x} \qquad \underline{p}_y \to -i\hbar \frac{\partial}{\partial y} \qquad \underline{p}_z \to -i\hbar \frac{\partial}{\partial z} \tag{4.85}$$

to show that the Schroedinger equation for a single particle of mass $m$ and charge $q^{(ch)}$ moving
in a given electromagnetic field is

$$i\hbar \frac{\partial \Psi}{\partial t} = \frac{\left( -i\hbar \nabla - q^{(ch)} \mathbf{A}(\mathbf{r}, t)/c \right) \cdot \left( -i\hbar \nabla - q^{(ch)} \mathbf{A}(\mathbf{r}, t)/c \right)}{2m} \Psi + q^{(ch)} \Phi(\mathbf{r}, t) \Psi \tag{4.86}$$

Note that the $\nabla$ in the first of the two $(-i\hbar \nabla - q^{(ch)} \mathbf{A}(\mathbf{r}, t)/c)$ factors does operate on the
$\mathbf{A}(\mathbf{r}, t)$ function in the second one, as well as on $\Psi$.

25 Note that, when there is a difference, the quantum operators replace $\underline{\mathbf{p}}$ and not the particle momentum $\mathbf{p} = m\mathbf{v}$.
See also the discussion in Section 12.13. Equation (4.86) correctly describes a nonrelativistic, spinless, charged
particle in an external electromagnetic field. See, for example, page 387 of Shankar (1994).

Exercise 4.8 In Exercise 2.9 you found the generalized energy functions for a mass on a
rotating table in two different coordinate systems, one fixed and one rotating. You should
use the generalized energy functions and canonical momenta from your previous work as the
starting point of the present problem.

(a) Find the Hamiltonians $H(q, p, t)$ and $H'(q', p', t)$ in these two systems.

(b) Use the Hamilton equations

$$\dot{H} = \frac{\partial H(q, p, t)}{\partial t} \qquad \text{and} \qquad \dot{H}' = \frac{\partial H'(q', p', t)}{\partial t} \tag{4.87}$$

to verify your earlier result that $H$ is not conserved but $H'$ is.

Exercise 4.9 Consider a system consisting of a single particle.

(a) Using the phase-space variables $q, p = x, y, z, p_x, p_y, p_z$, prove that for any phase-space
function $f(q, p)$,

$$[f(q, p), p_x] = \frac{\partial f}{\partial x} \qquad [f(q, p), p_y] = \frac{\partial f}{\partial y} \qquad [f(q, p), p_z] = \frac{\partial f}{\partial z} \tag{4.88}$$

(b) The orbital angular momentum of a single mass is $\mathbf{L} = \mathbf{r} \times \mathbf{p}$. Prove that

$$[L_z, x] = y \qquad [L_z, y] = -x \qquad [L_z, z] = 0 \tag{4.89}$$

and

$$[L_z, p_x] = p_y \qquad [L_z, p_y] = -p_x \qquad [L_z, p_z] = 0 \tag{4.90}$$



The calculus of variations is of enormous importance, not just in analytical mechanics,
but in the whole of theoretical physics. The present chapter introduces it in the
context of the finite-dimensional configuration spaces discussed in previous chapters. 
Mastery of this relatively simple form of the theory will provide the background re- 
quired to study more advanced topics such as the variations of fields in the complex 
spaces of quantum field theory. 

To understand what a variation is, imagine a curve drawn between two given 
points in a three-dimensional Cartesian space. Such a curve is often called a path 
between these points. Now imagine a line integral along that path, integrating some 
quantity of interest to us. For example, that quantity might be simply the increment 
of distance, so that the integral would give the total length of the path. 

Now imagine several different paths between these same two end points. The 
integrals along these different paths would, in general, be different. The calculus of 
variations is concerned with the comparison of these line integrals along different 
paths. The difference between the integral along some chosen path and the integral 
of the same quantity along other paths is called the variation of that integral. 

For example, if the integrated quantity is total length, we might want to find the 
shortest distance between the two points. Just as the minimum of an ordinary func- 
tion happens at a point at which its first-order rate of change vanishes (vanishing 
first derivative), so the shortest path turns out to be the path whose length is, to first 
order, equal to the length of its near neighbors. It is called an extremum path.$^{26}$ The
variation of the integral about that extremum path will thus vanish to first order. 

The presentation of the calculus of variations in the present chapter uses what we
call the General Parametric Method. In it, a path is specified parametrically, by letting
each of its coordinates be a function of some monotonically varying, but initially
unspecified, parameter $\beta$. This method contrasts with some other textbooks in which
one of the coordinates is used as the parameter, and the other variables are made
functions of it rather than of a general $\beta$. The two methods are compared in detail
in Sections 5.14 and 5.15. The General Parametric approach used here has much to
recommend it, and the reader is urged to adopt it.

26 In ordinary calculus, after finding a point where the first derivative vanishes, we must evaluate the
second derivative to see if the point is a maximum, minimum, or point of inflection. A similar test would
be required also in the calculus of variations. However, the first-order theory presented in this chapter is
not capable of such a test, so we must accept the extremum determination and try to guess from context
whether the extremum is indeed a maximum or minimum.



5.1 Paths in an N-Dimensional Space

We want a mathematical characterization of paths that can be generalized to spaces
of more than three dimensions. In a three-dimensional Cartesian example we could
imagine a path to be represented by a perspective drawing of it, but that will not be
possible in spaces of higher dimension. So, even in the three-dimensional example, we
will choose to represent a path by giving the three Cartesian coordinates as functions
of a common parameter $\beta$, as $x = x(\beta)$, $y = y(\beta)$, and $z = z(\beta)$, and picturing it
graphically as the three, separate graphs of these three functions.






Fig. 5.1. A path in a three-dimensional space represented by three graphs. 

This Cartesian example is now easily generalized to $N$-dimensional spaces. A path
is characterized by making each of the coordinates $x_k$ of such a space be a function
of some parameter $\beta$ that is unspecified except for the assumption that it increases
monotonically as the represented point moves along the path. Thus, for $k = 1, \ldots, N$,

$$x_k = x_k(\beta) \tag{5.1}$$

would be represented by $N$ graphs, each one of a particular coordinate versus $\beta$.
Together these $N$ functions and their associated graphs represent a single path in the
$N$-dimensional space, traced out as $\beta$ advances.

The configuration spaces of mechanics described in Section 2.1, with $x_k$ replaced
by $q_k$, are one example of the kind of spaces in which the calculus of variations may
be used. Chapter 6 is devoted to these applications. But the calculus of variations is
more general than this particular application, and may also be used to solve problems
that have nothing to do with mechanics.

Various paths will be given special names. First, imagine that some arbitrary path
$x_k = x_k(\beta)$ has been chosen at the beginning of a calculation. This will be called the
chosen path or the unvaried path. It will be considered to run between beginning and
ending values of parameter $\beta$, denoted as $\beta_1$ and $\beta_2$ respectively, and to have the end
points defined, for $k = 1, \ldots, N$, by

$$x_k^{(1)} = x_k(\beta_1) \qquad \text{and} \qquad x_k^{(2)} = x_k(\beta_2) \tag{5.2}$$

After defining the chosen path, now consider another path, different from it but
passing through the same end points. This will be called the varied path. A general
way of writing such a varied path is to introduce a single scale parameter $\delta a$ and a set
of $N$ shape functions $\eta_k(\beta)$ so that the varied path $x_k(\beta, \delta a)$ is defined by its deviation
from the chosen unvaried one. Thus, for $k = 1, \ldots, N$,

$$x_k(\beta, \delta a) = x_k(\beta) + \delta a \, \eta_k(\beta) \tag{5.3}$$

The shape functions are finite, differentiable functions of $\beta$ that are arbitrary except
for the condition

$$\eta_k(\beta_1) = \eta_k(\beta_2) = 0 \tag{5.4}$$

which ensures that the varied and unvaried paths cross at the end points. Note that
the scale parameter $\delta a$ is a rough measure of the difference between the varied and
unvaried paths, in that the two paths coalesce as $\delta a$ goes to zero. This scale parameter
is the same for all coordinates $x_k$ and the same for the whole path; it is not a function
of the index $k$ or the parameter $\beta$.
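
The construction of a varied path can be sketched in a few lines. The example below is illustrative only: the chosen path, the shape functions, and the end values `BETA1` and `BETA2` are all hypothetical choices, made so that the shape functions vanish at both end points as required above.

```python
import math

BETA1, BETA2 = 0.0, 1.0  # assumed beginning and ending parameter values


def chosen_path(beta):
    """A hypothetical unvaried path x_k(beta) in a three-dimensional space."""
    return (beta, beta**2, math.sin(beta))


def shape(beta):
    """Hypothetical shape functions eta_k(beta); the common factor
    (beta - BETA1) * (BETA2 - beta) makes each vanish at both end points."""
    bump = (beta - BETA1) * (BETA2 - beta)
    return (bump, 2.0 * bump, -bump)


def varied_path(beta, delta_a):
    """Varied path: x_k(beta) + delta_a * eta_k(beta)."""
    return tuple(x + delta_a * eta
                 for x, eta in zip(chosen_path(beta), shape(beta)))
```

For any `delta_a`, `varied_path(BETA1, delta_a)` and `varied_path(BETA2, delta_a)` coincide with the end points of `chosen_path`, while at interior values of `beta` the two paths differ by an amount proportional to the single scale parameter `delta_a`.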

5.2 Variations of Coordinates 

The calculus of variations is based on comparisons of quantities evaluated on the
varied path with the same quantities evaluated, at the same $\beta$ value, on the unvaried
path. The difference between such quantities evaluated on the two paths is called the
variation of the quantity. For example, the coordinates themselves can be compared,
leading to the variation $\delta x_k(\beta)$ defined, for all $k = 1, \ldots, N$, by

$$\delta x_k(\beta) = x_k(\beta, \delta a) - x_k(\beta) = \delta a \, \eta_k(\beta) \tag{5.5}$$

Note that the comparison happens at fixed $\beta$, but that the variation $\delta x_k(\beta)$ is itself a
function of $\beta$. For example, eqn (5.4) shows that it vanishes at the end points,

$$\delta x_k(\beta_1) = \delta x_k(\beta_2) = 0 \tag{5.6}$$

Another quantity to compare is the derivative of $x_k$ with respect to $\beta$. On the
unvaried path, this derivative is$^{27}$

$$\dot{x}_k(\beta) = \frac{dx_k(\beta)}{d\beta} \tag{5.7}$$

The derivative on the varied path is found by differentiating eqn (5.3), taking account
of the fact that the scale parameter $\delta a$ is not a function of $\beta$. It is

$$\frac{dx_k(\beta, \delta a)}{d\beta} = \frac{dx_k(\beta)}{d\beta} + \delta a \frac{d\eta_k(\beta)}{d\beta} \tag{5.8}$$

or, in a simpler notation,

$$\dot{x}_k(\beta, \delta a) = \dot{x}_k(\beta) + \delta a \, \dot{\eta}_k(\beta) \tag{5.9}$$

27 Note that throughout this chapter we will denote total derivatives with respect to $\beta$ by a dot placed
above the differentiated quantity.

The variation $\delta \dot{x}_k(\beta)$ is found as the difference between eqns (5.8, 5.7),

$$\delta \dot{x}_k(\beta) = \frac{dx_k(\beta, \delta a)}{d\beta} - \frac{dx_k(\beta)}{d\beta} = \delta a \frac{d\eta_k(\beta)}{d\beta} = \delta a \, \dot{\eta}_k(\beta) \tag{5.10}$$

An important consequence of the definition eqn (5.10) is that

$$\delta \dot{x}_k(\beta) = \frac{d}{d\beta} \delta x_k(\beta) \tag{5.11}$$

Note that it follows from the definitions of this section that, for each value of $\beta$,
the values on the varied path can be thought of as values on the chosen path plus a
variation

$$x_k(\beta, \delta a) = x_k(\beta) + \delta x_k(\beta) \qquad \text{and} \qquad \dot{x}_k(\beta, \delta a) = \dot{x}_k(\beta) + \delta \dot{x}_k(\beta) \tag{5.12}$$




Fig. 5.2. Varied (dashed) and unvaried (solid) paths for a typical coordinate x k . 



5.3 Variations of Functions 

The variation $\Delta f$ of a function $f = f(x, \dot{x})$ is defined as the difference between its
values on the varied and unvaried paths, again taken at the same value of $\beta$,

$$\Delta f = f\left(x(\beta, \delta a), \dot{x}(\beta, \delta a)\right) - f\left(x(\beta), \dot{x}(\beta)\right) \tag{5.13}$$

The difference in eqn (5.13) may be expanded using a Taylor series, giving

$$\Delta f = \left( \frac{\partial f\left(x(\beta, h), \dot{x}(\beta, h)\right)}{\partial h} \right)_{h=0} \delta a + \frac{1}{2} \left( \frac{\partial^2 f\left(x(\beta, h), \dot{x}(\beta, h)\right)}{\partial h^2} \right)_{h=0} \delta a^2 + o(\delta a^2) \tag{5.14}$$

in which $x(\beta, h)$ and $\dot{x}(\beta, h)$ denote the varied path of eqns (5.3, 5.9) with the scale
parameter set equal to $h$.

If $\Delta f$ is calculated using only the first term on the right, the resulting quantity is
called the first-order variation and is denoted $\delta f$. Thus

$$\delta f = \left( \frac{\partial f\left(x(\beta, h), \dot{x}(\beta, h)\right)}{\partial h} \right)_{h=0} \delta a \tag{5.15}$$

Such first-order variations, which are the only ones used in the present text, are
sufficient to determine extremum paths, but not to determine whether those paths are
maxima, minima, or paths of inflection.

Note that the distinction between the variation $\Delta f$ and the first-order variation $\delta f$
was not needed in Section 5.2 when only the coordinates and their derivatives were
being varied. The definition in eqn (5.3) contains $\delta a$ only to the first power, and hence
$\Delta x_k = \delta x_k$ with a similar result for the derivatives, $\Delta \dot{x}_k = \delta \dot{x}_k$.

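One may expand the first partial derivative with respect to $h$ in eqn (5.15) using
eqns (5.3, 5.9) and the chain rule, giving

$$\begin{aligned}
\delta f &= \sum_{k=1}^{N} \left( \frac{\partial f(x, \dot{x})}{\partial x_k} \frac{\partial x_k(\beta, h)}{\partial h} + \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \frac{\partial \dot{x}_k(\beta, h)}{\partial h} \right)_{h=0} \delta a \\
&= \sum_{k=1}^{N} \left( \frac{\partial f(x, \dot{x})}{\partial x_k}\, \eta_k(\beta)\, \delta a + \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k}\, \dot{\eta}_k(\beta)\, \delta a \right) \\
&= \sum_{k=1}^{N} \left( \frac{\partial f(x, \dot{x})}{\partial x_k}\, \delta x_k(\beta) + \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k}\, \delta \dot{x}_k(\beta) \right)
\end{aligned} \tag{5.16}$$

where it is assumed that after the partials of $f$ are taken, they are to be evaluated on
the unvaried path with $h = 0$.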
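
The difference between the variation of eqn (5.13) and the first-order variation of eqn (5.16)
can be illustrated numerically. The sketch below assumes a hypothetical integrand
$f = x \dot{x}^2$ of a single coordinate and arbitrary sample values at one fixed $\beta$; none of
these choices come from the text.

```python
def f(x, xdot):
    """Hypothetical integrand, chosen only for illustration."""
    return x * xdot ** 2

def df_dx(x, xdot):
    """Partial derivative of f with respect to x."""
    return xdot ** 2

def df_dxdot(x, xdot):
    """Partial derivative of f with respect to xdot."""
    return 2.0 * x * xdot

# Values on the unvaried path at one fixed beta, and the shape function there.
x0, xdot0 = 1.5, -0.7      # x(beta), xdot(beta)
eta, etadot = 0.4, 2.0     # eta(beta), etadot(beta)

def full_variation(delta_a):
    """Delta f of eqn (5.13)."""
    return f(x0 + delta_a * eta, xdot0 + delta_a * etadot) - f(x0, xdot0)

def first_order_variation(delta_a):
    """delta f of eqn (5.16), with delta x = delta_a*eta and delta xdot = delta_a*etadot."""
    return df_dx(x0, xdot0) * delta_a * eta + df_dxdot(x0, xdot0) * delta_a * etadot

# Halving delta_a should cut the mismatch by about a factor of four.
err_big = abs(full_variation(1e-2) - first_order_variation(1e-2))
err_small = abs(full_variation(5e-3) - first_order_variation(5e-3))
ratio = err_big / err_small
```

The ratio close to four reflects the fact that the neglected terms begin at second order in the
scale parameter.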



5.4 Variation of a Line Integral 

The interesting applications of the calculus of variations involve variation of line
integrals along paths. A line integral of a function $f(x, \dot{x})$ is taken along some line
between end values $\beta_1$ and $\beta_2$ as

$$I = \int_{\beta_1}^{\beta_2} f\left(x(\beta), \dot{x}(\beta)\right) d\beta \tag{5.17}$$

When taken along the varied path, this integral $I$ is a function of the scale parameter
$\delta a$, and a functional of the chosen, unvaried path $x(\beta)$ and the shape function $\eta(\beta)$.
It may be defined as

$$I(\delta a, [x], [\eta]) = \int_{\beta_1}^{\beta_2} f\left(x(\beta, \delta a), \dot{x}(\beta, \delta a)\right) d\beta \tag{5.18}$$

where the quantities in square brackets indicate functional dependence on the
enclosed functions.$^{28}$

28 A functional is a function of a function. The $[x]$ indicates that $I(\delta a, [x], [\eta])$ depends on the whole of
the function $x(\beta)$ for all values $\beta_1 \le \beta \le \beta_2$.

The line integral along the chosen or unvaried path is the same
integral, but with $\delta a = 0$,

$$I(0, [x], [\eta]) = \int_{\beta_1}^{\beta_2} f\left(x(\beta), \dot{x}(\beta)\right) d\beta \tag{5.19}$$

The variation of $I$ is by definition the difference between these two integrals,

$$\begin{aligned}
\Delta I &= I(\delta a, [x], [\eta]) - I(0, [x], [\eta]) \\
&= \int_{\beta_1}^{\beta_2} \left\{ f\left(x(\beta, \delta a), \dot{x}(\beta, \delta a)\right) - f\left(x(\beta), \dot{x}(\beta)\right) \right\} d\beta \\
&= \int_{\beta_1}^{\beta_2} \Delta f \, d\beta
\end{aligned} \tag{5.20}$$

where $\Delta f$ is the variation defined in eqn (5.13).

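Since the scale parameter $\delta a$ does not depend on $\beta$, inserting $\Delta f$ from eqn (5.14)
into eqn (5.20) gives

$$\Delta I = \left\{ \int_{\beta_1}^{\beta_2} \left( \frac{\partial f\left(x(\beta, h), \dot{x}(\beta, h)\right)}{\partial h} \right)_{h=0} d\beta \right\} \delta a + \frac{1}{2} \left\{ \int_{\beta_1}^{\beta_2} \left( \frac{\partial^2 f\left(x(\beta, h), \dot{x}(\beta, h)\right)}{\partial h^2} \right)_{h=0} d\beta \right\} \delta a^2 + o(\delta a^2) \tag{5.21}$$

Using eqn (5.15), the first-order term in the variation $\Delta I$ may be written as

$$\delta I = \int_{\beta_1}^{\beta_2} \left( \frac{\partial f\left(x(\beta, h), \dot{x}(\beta, h)\right)}{\partial h} \right)_{h=0} \delta a \, d\beta = \int_{\beta_1}^{\beta_2} \delta f \, d\beta \tag{5.22}$$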



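Substituting eqn (5.16) for $\delta f$ gives

$$\delta I = \int_{\beta_1}^{\beta_2} \sum_{k=1}^{N} \left( \frac{\partial f(x, \dot{x})}{\partial x_k}\, \delta x_k(\beta) + \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k}\, \delta \dot{x}_k(\beta) \right) d\beta \tag{5.23}$$

It will be useful to modify eqn (5.23) slightly, using eqn (5.11) to do an integration by
parts,

$$\begin{aligned}
\delta I &= \int_{\beta_1}^{\beta_2} \sum_{k=1}^{N} \left( \frac{\partial f(x, \dot{x})}{\partial x_k}\, \delta x_k(\beta) + \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k}\, \frac{d}{d\beta}\, \delta x_k(\beta) \right) d\beta \\
&= \int_{\beta_1}^{\beta_2} \sum_{k=1}^{N} \left\{ \frac{\partial f(x, \dot{x})}{\partial x_k}\, \delta x_k(\beta) + \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k}\, \delta x_k(\beta) \right) - \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) \delta x_k(\beta) \right\} d\beta
\end{aligned} \tag{5.24}$$

The perfect-differential term may be integrated immediately to give an integrand
evaluated at the end points.
Thus, the first-order variation of the line integral reduces to the expression,

$$\delta I = \left. \sum_{k=1}^{N} \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k}\, \delta x_k(\beta) \right|_{\beta_1}^{\beta_2} - \int_{\beta_1}^{\beta_2} \sum_{k=1}^{N} \left\{ \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k} \right\} \delta x_k(\beta)\, d\beta \tag{5.25}$$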
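
The result in eqn (5.25) can be checked numerically in a simple case. The sketch below is
only an illustration under stated assumptions: a single coordinate, a hypothetical integrand
$f = \frac{1}{2}\dot{x}^2 - \frac{1}{2}x^2$, a hypothetical unvaried path $x = \beta^2$ on $0 \le \beta \le 1$, and a
sine-shaped shape function. It compares the first-order change of the integral with the
integral term of eqn (5.25), the end-point term being zero.

```python
import math

def x(beta):            # hypothetical unvaried path
    return beta ** 2

def xdot(beta):
    return 2.0 * beta

def eta(beta):          # shape function vanishing at beta = 0 and beta = 1
    return math.sin(math.pi * beta)

def etadot(beta):
    return math.pi * math.cos(math.pi * beta)

def f(xv, xd):          # hypothetical integrand f(x, xdot)
    return 0.5 * xd ** 2 - 0.5 * xv ** 2

def midpoint(g, n=4000):
    """Midpoint-rule quadrature of g over 0 <= beta <= 1."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def line_integral(delta_a):
    """I(delta_a, [x], [eta]) of eqn (5.18)."""
    return midpoint(lambda b: f(x(b) + delta_a * eta(b), xdot(b) + delta_a * etadot(b)))

delta_a = 1e-3
# First-order change in I, estimated by a centred difference in delta_a.
numeric = 0.5 * (line_integral(delta_a) - line_integral(-delta_a))
# Eqn (5.25): the end-point term vanishes, and for this f the curly bracket
# is xddot + x = 2 + beta**2 on the chosen path.
predicted = -midpoint(lambda b: (2.0 + b ** 2) * eta(b) * delta_a)
```

Since this path does not make the bracket vanish, the first-order variation is nonzero, and the
two estimates agree to within the quadrature error.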



5.5 Finding Extremum Paths 

The typical use of the calculus of variations is to find paths that give extremum values
to various line integrals. By definition, an extremum path is such that it, and all nearby
paths that cross it at the end points, produce the same value for the line integral, to
first order in $\delta a$. In other words, the chosen unvaried path is an extremum if the
first-order variation vanishes, $\delta I = 0$, in analogy to the vanishing of the first derivative at
the extremum points of functions in ordinary calculus.

The main theorem of the calculus of variations may now be stated.

Theorem 5.5.1: Euler-Lagrange Theorem

Assume a chosen unvaried path $x_k(\beta)$, varied paths $x_k(\beta, \delta a) = x_k(\beta) + \delta x_k(\beta)$ as
defined in Sections 5.1 and 5.2, and a line integral

$$I = \int_{\beta_1}^{\beta_2} f(x, \dot{x})\, d\beta \tag{5.26}$$

along those paths as specified in Section 5.4.

With the variations $\delta x_k(\beta)$ assumed arbitrary except for the condition that they vanish
at the end points as stated in eqn (5.6), the unvaried path is an extremum path of
this integral, with vanishing first-order variation $\delta I = 0$, if and only if the $x_k(\beta)$ of the
unvaried path are a solution to the Euler-Lagrange differential equations$^{29}$

$$\frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k} = 0 \tag{5.27}$$

for $k = 1, \ldots, N$.

Proof: First, we assume eqn (5.27) and use eqn (5.25) to prove that $\delta I = 0$. Since
eqn (5.27) holds for each value of $k$ and $\beta$, the integrand of the second term on the
right in eqn (5.25) vanishes identically and so the integral is zero. The first term also
vanishes due to the assumed vanishing of the variations at the end points. Thus $\delta I = 0$
regardless of the $\delta x_k(\beta)$ used, as was to be proved.

29 These equations are conventionally called the Euler-Lagrange equations, presumably to distinguish
them from the Lagrange equations of mechanics, which have virtually the same form. One of the first uses
of extremum principles was Fermat’s Principle (see Section 5.6), but Euler made the first clear statement
of the calculus of variations as a general computational method.

The proof that $\delta I = 0$ implies eqn (5.27) also uses eqn (5.25). Assuming that
$\delta I = 0$, and that the variations vanish at the end points, the first term on the right of
eqn (5.25) is zero, giving

$$0 = \delta I = - \int_{\beta_1}^{\beta_2} \sum_{k=1}^{N} \left\{ \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k} \right\} \delta x_k(\beta)\, d\beta \tag{5.28}$$

Since the $\delta x_k(\beta)$ are arbitrary, they can be set nonzero one at a time. Suppose, for
definiteness, we set all of them to zero except $\delta x_5(\beta)$. Then the sum in eqn (5.28)
collapses to just the $k = 5$ term,

$$0 = \delta I = - \int_{\beta_1}^{\beta_2} \left\{ \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_5} \right) - \frac{\partial f(x, \dot{x})}{\partial x_5} \right\} \delta x_5(\beta)\, d\beta = - \int_{\beta_1}^{\beta_2} \Gamma_5(\beta)\, \delta x_5(\beta)\, d\beta \tag{5.29}$$

where $\Gamma_5(\beta)$ stands for the quantity in the curly brackets. Now, exploiting the
arbitrariness of $\delta x_5(\beta) = \eta_5(\beta)\, \delta a$, choose some arbitrary value $\beta_1 < \beta_0 < \beta_2$ and define
$\eta_5(\beta)$ to be a continuous and continuously differentiable function which is zero except
in a small range of $\beta$ values $\beta_0 - \varepsilon < \beta < \beta_0 + \varepsilon$, and non-negative within that range.
For example, one may choose $\eta_5(\beta) = \exp\{-1/y^2\}$ where $y = \sqrt{\varepsilon^2/(\beta - \beta_0)^2 - 1}$.
Then, using the mean value theorem of the integral calculus to collapse the integral,
eqn (5.29) reduces to

$$0 = \Gamma_5(\beta_0 + \theta_\varepsilon \varepsilon)\, C\, \delta a \tag{5.30}$$

where $\theta_\varepsilon$ is some number in the range $-1 \le \theta_\varepsilon \le 1$, and $C > 0$ is the integral of
$\eta_5$ over its nonzero range. This implies that, for any nonzero $\varepsilon$ value, $\Gamma_5(\beta_0 + \theta_\varepsilon \varepsilon)$ is
zero for some $\theta_\varepsilon$. Since the function $f$ is assumed to be continuously differentiable,
the function $\Gamma_5(\beta)$ is continuous. Taking the limit as $\varepsilon \to 0$ then gives $\Gamma_5(\beta_0) =
\lim_{\varepsilon \to 0} \Gamma_5(\beta_0 + \theta_\varepsilon \varepsilon) = 0$, and so

$$0 = \Gamma_5(\beta_0) = \left. \left\{ \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_5} \right) - \frac{\partial f(x, \dot{x})}{\partial x_5} \right\} \right|_{\beta = \beta_0} \tag{5.31}$$

But, since $k = 5$ and $\beta_0$ were arbitrarily chosen, any values may be chosen instead
and so eqn (5.27) must be true for any $k$ and $\beta$ values, as was to be proved. When
$\beta_0$ is one of the end values $\beta_1$ or $\beta_2$, eqn (5.31) follows from its validity for interior
values and the assumed continuity of $\Gamma_5$. □
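
The localized shape function used in the proof can be sketched in a few lines. The function
name and the sample values of $\beta_0$ and $\varepsilon$ below are assumptions for illustration. The
exponent is written in the algebraically equivalent form $-d^2/(\varepsilon^2 - d^2)$, with
$d = \beta - \beta_0$, so that no division by zero occurs at $\beta = \beta_0$.

```python
import math

def bump(beta, beta0, eps):
    """Shape function eta_5(beta) = exp(-1/y**2), y**2 = eps**2/(beta-beta0)**2 - 1.

    Written as exp(-d**2 / (eps**2 - d**2)) with d = beta - beta0, which is the same
    expression rearranged to avoid dividing by zero at beta = beta0.
    """
    d = beta - beta0
    if abs(d) >= eps:
        return 0.0          # identically zero outside the small range
    return math.exp(-d * d / (eps * eps - d * d))

beta0, eps = 0.5, 0.1
inside = [bump(beta0 + t * eps, beta0, eps) for t in (-0.9, -0.5, 0.0, 0.5, 0.9)]
outside = [bump(b, beta0, eps) for b in (0.0, 0.39, 0.61, 1.0)]
```

The values in `inside` are positive and peak at one when $\beta = \beta_0$, while those in `outside`
are exactly zero, which is the behavior the proof requires.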



Note that not every chosen unvaried path is an extremum path. It is quite possible
to choose some unvaried path and define a varied path based on it, only to find that
$\delta I \neq 0$. But if we choose an unvaried path that satisfies eqn (5.27), then we can be
sure that it is an extremum path with $\delta I = 0$. Such a path can always be found,
since the Euler-Lagrange equations in eqn (5.27) are $N$ differential equations in $N$
unknowns and so in principle can be solved exactly.



5.6 Example of an Extremum Path Calculation 

In optics, Fermat’s Principle says that light rays always travel on paths that make
the phase transit time $T$ an extremum.$^{30}$ Denoting the phase velocity of light by $v$,
the time required for a wave crest to traverse a distance $ds$ is $dT = ds/v = n\, ds/c$
where $n$ is the index of refraction, and $c$ is the vacuum speed of light. The quantity
$n\, ds = dO$ is an increment of what is often called the Optical Path Length $O$. Thus,
since $dT = dO/c$ it follows that $T = O/c$ and Fermat’s principle may be restated by
saying that the ray paths make an extremum of the integral defining the optical path
length,

$$O = \int_1^2 n(x, y, z)\, ds \tag{5.32}$$

30 Calling this Fermat’s principle of least times, as is often done, is inaccurate. It is actually Fermat’s
principle of extremum times. For example, all rays going from an object point to a focus point through a
perfect lens will have the same phase transit times.

To rewrite eqn (5.32) in a form that can be treated by the calculus of variations, let
$x, y, z$ be functions of a monotonic parameter $\beta$ so that, denoting derivatives with
respect to $\beta$ by a dot as usual,

$$ds = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\; d\beta \tag{5.33}$$

The line integral to be made an extremum becomes

$$O = \int_{\beta_1}^{\beta_2} n(x, y, z) \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\; d\beta \tag{5.34}$$

With $x_1 = x$, $x_2 = y$, and $x_3 = z$ and with

$$f(x, \dot{x}) = n(x, y, z) \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2} \tag{5.35}$$

Theorem 5.5.1 says that $O$ will be an extremum along an unvaried path that is a
solution to the three Euler-Lagrange equations, eqn (5.27), for $k = 1, 2, 3$,

$$\frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}} \right) - \frac{\partial f(x, \dot{x})}{\partial x} = 0 \tag{5.36}$$

$$\frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{y}} \right) - \frac{\partial f(x, \dot{x})}{\partial y} = 0 \tag{5.37}$$

$$\frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{z}} \right) - \frac{\partial f(x, \dot{x})}{\partial z} = 0 \tag{5.38}$$

Inserting eqn (5.35) into these equations gives the three equations

$$\frac{d}{d\beta} \left( \frac{n(x, y, z)\, \dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}} \right) - \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\; \frac{\partial n(x, y, z)}{\partial x} = 0 \tag{5.39}$$

$$\frac{d}{d\beta} \left( \frac{n(x, y, z)\, \dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}} \right) - \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\; \frac{\partial n(x, y, z)}{\partial y} = 0 \tag{5.40}$$

$$\frac{d}{d\beta} \left( \frac{n(x, y, z)\, \dot{z}}{\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}} \right) - \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\; \frac{\partial n(x, y, z)}{\partial z} = 0 \tag{5.41}$$

The three eqns (5.39 - 5.41) are equivalent to the single vector equation

$$\frac{d}{d\beta} \left( \frac{n(x, y, z)}{\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}}\, \frac{d\mathbf{r}}{d\beta} \right) - \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\; \nabla n(x, y, z) = 0 \tag{5.42}$$

which can be used to determine the extremum path.

Throughout the development so far, we have taken care not to specify the parameter
$\beta$. It can be any quantity that increases monotonically along the unvaried path. But
now, after all partial derivatives are taken and the final form of the Euler-Lagrange
equation, eqn (5.42), has been found, we are at liberty to make a choice of $\beta$ that will
make its solution easier. One choice that is particularly appropriate to this problem is
to choose $\beta$ equal to the arc length $s$ measured along the unvaried path starting at
point 1.$^{31}$ Then $\beta = s$ implies that

$$\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2} = \frac{ds}{d\beta} = \frac{ds}{ds} = 1 \tag{5.43}$$

on the unvaried path, so that eqn (5.42) simplifies to

$$\frac{d}{ds} \left( n\, \hat{\mathbf{t}} \right) - \nabla n = 0 \tag{5.44}$$

where $\hat{\mathbf{t}} = d\mathbf{r}/ds$ is the tangent unit vector defined in Section A.12. By inspection of
this equation we can see that the path will curve in the direction of increasing index of
refraction, giving, for example, a rough explanation of the desert mirages that occur
when surface heat makes $n$ smaller at the surface.

As a byproduct of this example, we can also prove that the extremum distance
between two points is a straight line. If we set $n(x, y, z) = 1$ in the above problem,
then the optical path length becomes just the geometrical path length, or distance. It
follows that the extremum of geometrical path length is gotten by setting $n = 1$ in
eqn (5.44), giving simply

$$\frac{d\hat{\mathbf{t}}}{ds} = 0 \tag{5.45}$$

But, as can be seen by reference to Section A.12, a path whose unit tangent vector is
a constant is a straight line.

This Fermat’s Principle example shows the utility of the General Parametric Method
which leaves the parameter $\beta$ unspecified until after all partial derivatives have been
taken and the Euler-Lagrange equations obtained. Upon examination of eqn (5.42),
it appeared that the choice $\beta = s$ allowed it to be simplified, and recast as a relation
among Serret-Frenet vectors. In some other problem, examination of the Euler-Lagrange
equations might suggest a different choice for $\beta$. (See, for example, Section
5.8.) By retaining $\beta$ as an unspecified monotonic parameter until the end of calculations,
one obtains the maximum flexibility in problem solution.
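
The statement that a ray curves toward increasing index of refraction can be illustrated by
stepping eqn (5.44) forward in arc length. The sketch below is a rough illustration only: the
linear profile $n = 1 + 0.1\,y$, the launch direction, and the simple Euler stepping are all
assumptions, and the stepping keeps the tangent vector at unit length only approximately.

```python
def n(y):
    """Hypothetical index of refraction, increasing linearly with height y."""
    return 1.0 + 0.1 * y

GRAD_N = (0.0, 0.1)     # gradient of n for the profile above

def trace_ray(steps=1000, ds=1e-3):
    """Euler steps of d(n t)/ds = grad n, eqn (5.44), with dr/ds = t, in the x-y plane."""
    x, y = 0.0, 0.0
    px, py = n(y) * 1.0, 0.0            # p = n t, ray launched along +x
    for _ in range(steps):
        tx, ty = px / n(y), py / n(y)   # tangent vector t = p / n
        x, y = x + tx * ds, y + ty * ds
        px, py = px + GRAD_N[0] * ds, py + GRAD_N[1] * ds
    return x, y, px, py

x_end, y_end, px_end, py_end = trace_ray()
```

The ray starts horizontally and ends with positive `y_end`, so it has bent toward larger $n$.
The component `px_end` stays fixed because $n$ does not depend on $x$ in this profile, which is
the familiar statement that $n$ times the sine of the angle from the gradient direction is
conserved.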



31 It is important that this be measured along the unvaried path. The arc lengths along the varied paths
would depend on $\delta a$, which would violate the condition that variations compare quantities at the same $\beta$
value.

5.7 Invariance and Homogeneity

In Section 5.6, the integrand of the variational integral was the arc length $ds$ weighted
by a scalar function $n(x, y, z)$. This integral was translated into general parametric
form by writing $ds = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\; d\beta$.

In undergraduate texts, variational integrals are often presented in a form such as

$$I = \int g\left(x, y, z, \frac{dy}{dx}, \frac{dz}{dx}\right) dx \tag{5.46}$$

in which the integration variable is one of the coordinates, here $x$ for example, rather
than a general parameter $\beta$. These integrals can always be recast into general parametric
form by writing $dx = \dot{x}\, d\beta$, where $\dot{x} = dx/d\beta$ in the notation being used in this
chapter. Then writing $dy/dx = \dot{y}/\dot{x}$ and $dz/dx = \dot{z}/\dot{x}$ gives

$$I = \int g\left(x, y, z, \frac{dy}{dx}, \frac{dz}{dx}\right) dx = \int g\left(x, y, z, \frac{\dot{y}}{\dot{x}}, \frac{\dot{z}}{\dot{x}}\right) \dot{x}\, d\beta = \int f(x, y, z, \dot{x}, \dot{y}, \dot{z})\, d\beta \tag{5.47}$$

with

$$f(x, y, z, \dot{x}, \dot{y}, \dot{z}) = \dot{x}\, g\left(x, y, z, \frac{\dot{y}}{\dot{x}}, \frac{\dot{z}}{\dot{x}}\right) \tag{5.48}$$

The general parametric method can then be applied with $f(x, y, z, \dot{x}, \dot{y}, \dot{z})$ as the
integrand.
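
As a short worked illustration of this recasting, consider the arc length of a plane curve
written with $x$ as the integration variable. The choice of this particular $g$, and the
restriction to $\dot{x} > 0$, are assumptions made for the example.

```latex
% Arc length of a plane curve y(x), written first in the form of eqn (5.46)
% and then converted with eqn (5.48). Assumes \dot{x} > 0 along the path.
\begin{aligned}
g\left(x, y, \frac{dy}{dx}\right) &= \sqrt{1 + \left(\frac{dy}{dx}\right)^2} \\
f(x, y, \dot{x}, \dot{y}) &= \dot{x}\, g\left(x, y, \frac{\dot{y}}{\dot{x}}\right)
  = \dot{x} \sqrt{1 + \frac{\dot{y}^2}{\dot{x}^2}}
  = \sqrt{\dot{x}^2 + \dot{y}^2}
\end{aligned}
```

The result is the two-dimensional form of the integrand in eqn (5.35) with $n = 1$, as expected
for a path length.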

Both the function in eqn (5.35) and the $f$ in eqn (5.48) are seen to be homogeneous
of degree one in the set of derivatives $\dot{x}, \dot{y}, \dot{z}$.$^{32}$ This homogeneity is an essential
element of the general parametric method. The integral $I$ is equal to some physical
or geometrical quantity that is to be extremized. The parameter $\beta$ is just a dummy
integration variable with no physical or geometrical significance. Its replacement by
some other monotonic parameter $\theta = \theta(\beta)$ must not change the value of the integral
$I$. Thus

$$\int_{\beta_1}^{\beta_2} f(x, \dot{x})\, d\beta = I = \int_{\theta_1}^{\theta_2} f(x, x')\, d\theta = \int_{\beta_1}^{\beta_2} f\left(x, \dot{x}\, \frac{d\beta}{d\theta}\right) \frac{d\theta}{d\beta}\, d\beta \tag{5.49}$$

where $x'_k = dx_k/d\theta$ and hence $x'_k = \dot{x}_k\, (d\beta/d\theta)$, as has been indicated in the integrand
of the last expression on the right. This equality holds for any values of the limits
$x_k^{(1)} = x_k(\beta_1)$ and $x_k^{(2)} = x_k(\beta_2)$ and for any choice of path between them. It follows
that the integrand $f(x, \dot{x})$ must satisfy the relation

$$f(x, \dot{x}) = f\left(x, \dot{x}\, \frac{d\beta}{d\theta}\right) \frac{d\theta}{d\beta} \tag{5.50}$$

The required invariance under a change of parameter thus implies the homogeneity
of $f$, as stated in the following theorem.

32 Homogeneous functions are defined in Section D.31.

Theorem 5.7.1: Homogeneity

The value of the integral $I$ is unchanged when parameter $\beta$ is replaced by any other
monotonic parameter $\theta = \theta(\beta)$ if and only if the integrand $f(x, \dot{x})$ is homogeneous of
degree one in the set of derivatives $\dot{x} = \dot{x}_1, \dot{x}_2, \ldots, \dot{x}_N$.

Proof: Equation (5.50) can be written as

$$f(x, (\lambda \dot{x})) = \lambda f(x, \dot{x}) \tag{5.51}$$

where $\lambda = d\beta/d\theta$ is an arbitrary nonzero number. By Theorem D.31.1, this is the
necessary and sufficient condition for $f(x, \dot{x})$ to be homogeneous of degree one in
the set of derivatives $\dot{x}$. □
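
The condition in eqn (5.51) is easy to test numerically for a given integrand. The sketch below
is an illustration under assumptions: it uses the Fermat integrand of eqn (5.35) with a
hypothetical index $n = 1 + 0.2\,z$, and contrasts it with a hypothetical integrand that is
quadratic in the derivatives and therefore not homogeneous of degree one.

```python
import math

def f_fermat(r, rdot):
    """Integrand of eqn (5.35) with a hypothetical index n = 1 + 0.2*z."""
    x, y, z = r
    return (1.0 + 0.2 * z) * math.sqrt(sum(v * v for v in rdot))

def f_not_homogeneous(r, rdot):
    """A hypothetical integrand that is quadratic in the derivatives, for contrast."""
    return sum(v * v for v in rdot)

r, rdot, lam = (0.3, -1.0, 2.0), (0.5, 1.5, -0.4), 3.0
scaled = tuple(lam * v for v in rdot)

# Eqn (5.51): f(x, lam*xdot) == lam * f(x, xdot) holds only in the first case.
gap_fermat = f_fermat(r, scaled) - lam * f_fermat(r, rdot)
gap_other = f_not_homogeneous(r, scaled) - lam * f_not_homogeneous(r, rdot)
```

Only the first gap is zero up to rounding. An integral built on the second integrand would change
its value under a change of parameter, so it could not represent a parameter-independent quantity.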



It follows from the homogeneity of $f$ that the Euler-Lagrange equations are also
parameter independent.

Theorem 5.7.2: Invariance

If $\theta = \theta(\beta)$ is any monotonically varying parameter, then the $x_k(\beta)$ are a solution to the
Euler-Lagrange equations with parameter $\beta$, as shown in eqn (5.27), if and only if the
$x_k(\theta) = x_k(\beta(\theta))$ are a solution to the Euler-Lagrange equations with parameter $\theta$,

$$\frac{d}{d\theta} \left( \frac{\partial f(x, x')}{\partial x'_k} \right) - \frac{\partial f(x, x')}{\partial x_k} = 0 \tag{5.52}$$

for $k = 1, \ldots, N$, where $x'_k = dx_k/d\theta$.

Proof: From Theorem 5.5.1, and a similar theorem with $\beta$ replaced by $\theta$, the
Euler-Lagrange equations in $\beta$ and $\theta$ hold if and only if

$$0 = \delta \int_{\beta_1}^{\beta_2} f(x, \dot{x})\, d\beta \quad \text{and} \quad 0 = \delta \int_{\theta_1}^{\theta_2} f(x, x')\, d\theta \tag{5.53}$$

respectively. But, eqn (5.49) shows that the two integrals in eqn (5.53) are equal. Thus
solution of the Euler-Lagrange equation in $\beta$ implies the vanishing of both variations
in eqn (5.53), which in turn implies the solution of the Euler-Lagrange equation in $\theta$.
The same argument holds with $\beta$ and $\theta$ interchanged. □

The homogeneity of the integrand $f(x, \dot{x})$ also has the consequence that the $N$
Euler-Lagrange equations are redundant; only $(N - 1)$ of them are independent.

Theorem 5.7.3: Redundancy

The Euler-Lagrange equations are redundant. If some set of functions $x(\beta)$ satisfies the
Euler-Lagrange equations in eqn (5.27) for $k = 1, 2, \ldots, (l - 1), (l + 1), \ldots, N$, then
the Euler-Lagrange equation for index $l$ is also satisfied, except possibly at points where
$\dot{x}_l = 0$.

Proof: From the Euler condition, Theorem D.31.1, the homogeneity of $f(x, \dot{x})$ proved
in Theorem 5.7.1 implies that

$$0 = \sum_{k=1}^{N} \dot{x}_k \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} - f(x, \dot{x}) \tag{5.54}$$

Differentiating this expression with respect to $\beta$ and using the chain rule gives

$$\begin{aligned}
0 &= \sum_{k=1}^{N} \left\{ \ddot{x}_k \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} + \dot{x}_k \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k}\, \dot{x}_k - \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k}\, \ddot{x}_k \right\} \\
&= \sum_{k=1}^{N} \dot{x}_k \left\{ \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k} \right\}
\end{aligned} \tag{5.55}$$

Thus

$$\dot{x}_l \left\{ \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_l} \right) - \frac{\partial f(x, \dot{x})}{\partial x_l} \right\} = - \sum_{k \neq l} \dot{x}_k \left\{ \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k} \right\} \tag{5.56}$$

from which the theorem follows. □
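
As a brief check of the Euler condition in eqn (5.54), it may be applied to the Fermat integrand
of eqn (5.35). The following worked lines use only that integrand and its partial derivatives
with respect to $\dot{x}$, $\dot{y}$, $\dot{z}$.

```latex
% Euler condition, eqn (5.54), for f = n(x,y,z) \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}
\begin{aligned}
\sum_{k=1}^{3} \dot{x}_k \frac{\partial f}{\partial \dot{x}_k}
  &= \dot{x}\, \frac{n\, \dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}}
   + \dot{y}\, \frac{n\, \dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}}
   + \dot{z}\, \frac{n\, \dot{z}}{\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}} \\
  &= n\, \frac{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}{\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}}
   = n \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2} = f
\end{aligned}
```

The sum reproduces $f$ itself, so eqn (5.54) holds for this integrand and only two of the three
equations (5.39 - 5.41) are independent.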



5.8 The Brachistochrone Problem 

The general parametric method is particularly valuable when a proposed solution to
the Euler-Lagrange equations is written in terms of some parameter that is not itself
one of the variables of the problem. Then we can simplify the calculations by setting
$\beta$ equal to that parameter.

For example, the solution to the brachistochrone problem is known to be a cycloid,
the locus of a point on the circumference of a circle that rolls without slipping on a
line, usually written as

$$x = a(\theta - \sin\theta) \qquad y = a(1 - \cos\theta) \qquad z = 0 \tag{5.57}$$

where $\theta$ is the angle through which the circle has rolled. The Euler-Lagrange equations
for this problem will be obtained below as usual, with $\beta$ not yet specified. Then,
after all partial derivatives have been taken, $\beta$ can be set equal to $\theta$ to test whether
or not eqn (5.57) is a solution. And, due to the redundancy noted in Theorem 5.7.3,
only the two simplest of the three Euler-Lagrange equations will need to be tested.

The brachistochrone problem seeks the shape of a frictionless wire stretching between
$(0, 0, 0)$ and $(x^{(2)}, y^{(2)}, 0)$ such that a bead of mass $m$ sliding on the wire in a
uniform gravitational field $\mathbf{g} = g\, \hat{\mathbf{e}}_2$ moves from the origin to the final point in minimum
time $T$. By methods similar to those used in Section 5.6, and using energy
conservation to get the speed of the bead $v$, the problem reduces to finding the extremum
of the integral

$$I = T \sqrt{2g} = \sqrt{2g} \int_1^2 \frac{ds}{v} = \int_{\beta_1}^{\beta_2} \sqrt{\frac{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}{y}}\; d\beta \tag{5.58}$$

Fig. 5.3. Mass $m$ slides on wire from origin to point $(x^{(2)}, y^{(2)}, 0)$ in minimum time.

With $x_1 = x$, $x_2 = y$, $x_3 = z$, and

$$f(x, \dot{x}) = \sqrt{\frac{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}{y}} \tag{5.59}$$

the Euler-Lagrange equations, eqn (5.27) for $k = 1, 2, 3$,

$$\frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k} = 0 \tag{5.60}$$

reduce to the three equations

$$\frac{\dot{x}}{\sqrt{y \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right)}} = C_1 \tag{5.61}$$

$$\frac{d}{d\beta} \left( \frac{\dot{y}}{\sqrt{y \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right)}} \right) + \frac{1}{2} \sqrt{\frac{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}{y^3}} = 0 \qquad \frac{\dot{z}}{\sqrt{y \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right)}} = C_2 \tag{5.62}$$

where $C_1$ and $C_2$ are integration constants.

The $z$-equation can be dealt with at once. Since the square root denominator is
real and positive for the whole of the path, the second of eqn (5.62) implies that $\dot{z}$
can never change sign. Thus, since $\beta$ is monotonic, the function $z = z(\beta)$ can pass
through both the initial and final points (both with $z = 0$) only if $z = 0$ for the whole
path and $C_2 = 0$.

The other simple equation, eqn (5.61), can now be tested. Use the flexibility of the
general parametric method to choose $\beta = \theta$, where $\theta$ is the cycloid parameter in the
proposed solution eqn (5.57). Then, with $\dot{x} = dx/d\theta$, etc., the left side of eqn (5.61)
reduces to $1/\sqrt{2a}$ which is indeed a constant, and can be used to determine $C_1$.

Since $\dot{y} = dy/d\theta \neq 0$ except at the isolated point $\theta = \pi$, Theorem 5.7.3 shows that
the more complicated equation, the first of eqn (5.62), does not need to be tested. It is
satisfied automatically due to the redundancy of the Euler-Lagrange equations. Thus
eqn (5.57) does define the extremum path when the radius $a$ is adjusted so that the
cycloid curve passes through the final point $(x^{(2)}, y^{(2)}, 0)$.
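
The two claims just made can be checked numerically: that the left side of eqn (5.61) is the
constant $1/\sqrt{2a}$ on the cycloid, and that the first of eqn (5.62) then holds automatically.
The sketch below is an illustration only. The radius value and the sample angles are
assumptions, and the $\beta$-derivative is approximated by a centred difference.

```python
import math

A = 2.0   # hypothetical cycloid radius a

def path(theta):
    """Cycloid of eqn (5.57) and its theta-derivatives (beta = theta, z = 0)."""
    x, y = A * (theta - math.sin(theta)), A * (1.0 - math.cos(theta))
    xd, yd = A * (1.0 - math.cos(theta)), A * math.sin(theta)
    return x, y, xd, yd

def lhs_561(theta):
    """Left side of eqn (5.61)."""
    _, y, xd, yd = path(theta)
    return xd / math.sqrt(y * (xd ** 2 + yd ** 2))

def p_y(theta):
    """The quantity differentiated in the first of eqn (5.62)."""
    _, y, xd, yd = path(theta)
    return yd / math.sqrt(y * (xd ** 2 + yd ** 2))

def residual_562(theta, h=1e-5):
    """First of eqn (5.62), with the beta-derivative taken by centred difference."""
    _, y, xd, yd = path(theta)
    dp = (p_y(theta + h) - p_y(theta - h)) / (2.0 * h)
    return dp + 0.5 * math.sqrt((xd ** 2 + yd ** 2) / y ** 3)

thetas = [0.5 + 0.5 * i for i in range(11)]           # 0.5 ... 5.5, away from the cusps
c1_values = [lhs_561(t) for t in thetas]
worst_residual = max(abs(residual_562(t)) for t in thetas)
```

Every entry of `c1_values` equals $1/\sqrt{2a}$ to rounding, and `worst_residual` is small, limited
only by the finite-difference error.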

5.9 Calculus of Variations with Constraints 

Suppose now that we want to find the path that makes the integral in eqn (5.17)
an extremum, but now subject to $C$ holonomic constraints. These constraints are
expressed by writing a functionally independent set of $C$ functions of $x$ and then
requiring that the coordinates $x_k$ for $k = 1, \ldots, N$ and at each $\beta$ value be such as to
make these functions identically zero. Thus, for $a = 1, \ldots, C$,

$$0 = G_a(x) \tag{5.63}$$

Using the definitions of unvaried path, varied path, and variation developed in
Sections 5.2 and 5.3, these constraints are assumed to hold both on the unvaried path
$0 = G_a(x(\beta))$, and on the varied path $0 = G_a(x(\beta, \delta a))$. It follows that $\Delta G_a =
G_a(x(\beta, \delta a)) - G_a(x(\beta))$ is zero. Since the scale parameter $\delta a$ is an arbitrary
continuous parameter, it follows that the first-order variations $\delta G_a$ are zero also. Thus,
for $a = 1, \ldots, C$,

$$\delta G_a = 0 \tag{5.64}$$

Theorem 5.9.1: Euler-Lagrange with Constraints

The integral

$$I = \int_{\beta_1}^{\beta_2} f(x, \dot{x})\, d\beta \tag{5.65}$$

will be an extremum, with $\delta I = 0$ for variations that vanish at the end points but are
otherwise arbitrary except for the constraints in eqn (5.63), if and only if there exist $C$
functions $\lambda_a$ such that, for $k = 1, \ldots, N$,

$$\frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k} = \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(x)}{\partial x_k} \tag{5.66}$$

Together, eqns (5.63, 5.66) constitute $N + C$ equations in the $N + C$ unknowns
$x_1, \ldots, x_N, \lambda_1, \ldots, \lambda_C$ and so can be solved to find the extremum path. The functions
$\lambda_a$ are called$^{33}$ Lagrange multipliers.

33 In Chapter 2, the similarly denoted values $\lambda_a$ were related to the forces of constraint. But the theory in
the present chapter is more general. The Lagrange multipliers appear also in problems having nothing to
do with forces or with the Lagrange equations of mechanics.

Proof: Using the definitions in Section 5.3 for variation of a function, the condition
$\delta G_a = 0$ in eqn (5.64) may be expressed as a set of linear equations to be satisfied by
the $\delta x_k$,

$$0 = \delta G_a = \sum_{k=1}^{N} g_{ak}\, \delta x_k \tag{5.67}$$

for $a = 1, \ldots, C$, where the $C \times N$ matrix $g$ is defined by

$$g_{ak} = \frac{\partial G_a(x)}{\partial x_k} \tag{5.68}$$

The condition for the functional independence of the constraints is that matrix $g$
must be of rank $C$ and hence have a $C$-rowed critical minor. As was done in the proof
of Theorem 3.4.1, we may reorder the coordinate indices so that this critical minor is
formed from the $C$ rows and the last $C$ columns of $g$. Then, the $C \times C$ matrix $g^{(b)}$
defined by

$$g^{(b)}_{aj} = g_{a(N-C+j)} \tag{5.69}$$

will be nonsingular and have an inverse $g^{(b)-1}$, since its determinant is the critical
minor and hence is nonzero by definition. Thus eqn (5.67) may be written as

$$0 = \delta G_a = \sum_{i=1}^{N-C} g_{ai}\, \delta x^{(f)}_i + \sum_{j=1}^{C} g^{(b)}_{aj}\, \delta x^{(b)}_{(N-C+j)} \tag{5.70}$$

which breaks the expression into two sums, first over what will be called the free
coordinates, $x^{(f)} = x_1, \ldots, x_{(N-C)}$ and then over the bound coordinates $x^{(b)} =
x_{(N-C+1)}, \ldots, x_N$. Then eqn (5.70) can be solved for the variations of the bound
coordinates in terms of the variations of the free ones,

$$\delta x^{(b)}_{(N-C+j)} = - \sum_{a=1}^{C} \sum_{i=1}^{N-C} g^{(b)-1}_{ja}\, g_{ai}\, \delta x^{(f)}_i \tag{5.71}$$
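
For a single constraint, eqn (5.71) takes a particularly simple form, since the matrix $g^{(b)}$
is then just one number. The sketch below illustrates this for the hypothetical constraint
$G_1 = \sqrt{x^2 + y^2 + z^2} - a$ with $N = 3$ and $C = 1$. The sample point and the free variations are
arbitrary assumptions.

```python
import math

def constraint_row(r):
    """g_{1k} = dG_1/dx_k for the hypothetical single constraint G_1 = |r| - a (C = 1, N = 3)."""
    norm = math.sqrt(sum(v * v for v in r))
    return [v / norm for v in r]

def bound_variation(g, free_variations):
    """Eqn (5.71) for C = 1: the 1x1 matrix g^(b) is just g[-1], so its inverse is 1/g[-1]."""
    return -sum(gi * dxi for gi, dxi in zip(g[:-1], free_variations)) / g[-1]

r = (1.0, 2.0, 2.0)                  # a point with z != 0, so g^(b) is nonsingular
g = constraint_row(r)
dx_free = [1e-3, -2e-3]              # arbitrary variations of the free coordinates x, y
dx_bound = bound_variation(g, dx_free)
# Eqn (5.67): the full set of variations leaves the constraint unchanged to first order.
delta_G = sum(gk * dxk for gk, dxk in zip(g, dx_free + [dx_bound]))
```

Whatever free variations are chosen, the bound variation computed this way makes `delta_G`
vanish up to rounding, which is the content of eqn (5.67).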



Now to the main part of the proof. First we prove that, with the constraints, the
extremum condition $\delta I = 0$ implies eqn (5.66). The $\delta I$ here is the same as that derived
earlier and given in eqn (5.25). Using the assumed vanishing of the variation at the
end points to eliminate the integrated term, the assumed condition $\delta I = 0$ becomes

$$0 = \delta I = - \int_{\beta_1}^{\beta_2} \sum_{k=1}^{N} \left\{ \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k} \right\} \delta x_k\, d\beta = - \int_{\beta_1}^{\beta_2} \sum_{k=1}^{N} \Gamma_k\, \delta x_k\, d\beta \tag{5.72}$$

where the notational definition

$$\Gamma_k = \frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}_k} \right) - \frac{\partial f(x, \dot{x})}{\partial x_k} \tag{5.73}$$

has been introduced. If the variations $\delta x_k$ were all arbitrary and independent, as was
assumed in Section 5.5, then eqn (5.72) would have the immediate consequence that
$\Gamma_k = 0$ for all $k$. But eqn (5.71) shows that the bound variations are not independent.
Writing eqn (5.72) with separate sums over free and bound variables, substituting
eqn (5.71) to eliminate the dependent variations, and reordering some finite sums,
gives

$$\begin{aligned}
0 = \delta I &= - \int_{\beta_1}^{\beta_2} \sum_{i=1}^{N-C} \Gamma_i\, \delta x^{(f)}_i\, d\beta - \int_{\beta_1}^{\beta_2} \sum_{j=1}^{C} \Gamma_{(N-C+j)}\, \delta x^{(b)}_{(N-C+j)}\, d\beta \\
&= - \int_{\beta_1}^{\beta_2} \sum_{i=1}^{N-C} \left( \Gamma_i - \sum_{a=1}^{C} \sum_{j=1}^{C} \Gamma_{(N-C+j)}\, g^{(b)-1}_{ja}\, g_{ai} \right) \delta x^{(f)}_i\, d\beta
\end{aligned} \tag{5.74}$$

With the definition

$$\lambda_a = \sum_{j=1}^{C} \Gamma_{(N-C+j)}\, g^{(b)-1}_{ja} \tag{5.75}$$

eqn (5.74) becomes

$$0 = \delta I = - \int_{\beta_1}^{\beta_2} \sum_{i=1}^{N-C} \left( \Gamma_i - \sum_{a=1}^{C} \lambda_a\, g_{ai} \right) \delta x^{(f)}_i\, d\beta \tag{5.76}$$

But the variations of the free variables $\delta x^{(f)}_i$ for $i = 1, \ldots, (N - C)$ are independent.
The solution in eqn (5.71) satisfies the constraint equation, eqn (5.67), regardless of
the choices of the $\delta x^{(f)}_i$. Thus an argument similar to that in Section 5.5, with $\delta x^{(f)}_i$
set nonzero one at a time in eqn (5.76), shows that $\delta I = 0$ implies

$$\Gamma_i - \sum_{a=1}^{C} \lambda_a\, g_{ai} = 0 \tag{5.77}$$

for all $i = 1, \ldots, (N - C)$ and all values of $\beta$, which establishes eqn (5.66) for the free
variables.

To see that eqn (5.66) also holds for the bound variables, write an expression
like eqn (5.77), but for the bound indices, and substitute eqn (5.75) for $\lambda_a$ into it.
Thus, for all $j = 1, \ldots, C$,

$$\begin{aligned}
\Gamma_{(N-C+j)} - \sum_{a=1}^{C} \lambda_a\, g_{a(N-C+j)} &= \Gamma_{(N-C+j)} - \sum_{a=1}^{C} \lambda_a\, g^{(b)}_{aj} \\
&= \Gamma_{(N-C+j)} - \sum_{a=1}^{C} \sum_{l=1}^{C} \Gamma_{(N-C+l)}\, g^{(b)-1}_{la}\, g^{(b)}_{aj} \\
&= \Gamma_{(N-C+j)} - \sum_{l=1}^{C} \Gamma_{(N-C+l)}\, \delta_{lj} = 0
\end{aligned} \tag{5.78}$$

Thus eqn (5.66) holds for all values $k = 1, \ldots, N$, as was to be proved.

To prove the converse, that eqn (5.66) implies that $\delta I = 0$, note that after the
constrained variations have been eliminated, the variation $\delta I$ is equal to the right
side of eqn (5.76). But eqn (5.66) implies that the integrand in this right side vanishes
identically. Thus $\delta I = 0$, as was to be proved. □



5.10 An Example with Constraints 

Suppose that we want to find the extremum path between two points on the surface
of a sphere. Such a path is called a geodesic. Using coordinates $x_1 = x$, $x_2 = y$, and
$x_3 = z$, the integral to be extremized is

$$S = \int_1^2 ds = \int_{\beta_1}^{\beta_2} \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\; d\beta \tag{5.79}$$

and the constraint is

$$0 = G_1(x) = \sqrt{x^2 + y^2 + z^2} - a \tag{5.80}$$

Using eqn (5.66) with

$$f(x, \dot{x}) = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2} \tag{5.81}$$

the constrained Euler-Lagrange equations for $k = 1, 2, 3$ are

$$\frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{x}} \right) - \frac{\partial f(x, \dot{x})}{\partial x} = \lambda_1 \frac{\partial G_1(x)}{\partial x} \tag{5.82}$$

$$\frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{y}} \right) - \frac{\partial f(x, \dot{x})}{\partial y} = \lambda_1 \frac{\partial G_1(x)}{\partial y} \tag{5.83}$$

$$\frac{d}{d\beta} \left( \frac{\partial f(x, \dot{x})}{\partial \dot{z}} \right) - \frac{\partial f(x, \dot{x})}{\partial z} = \lambda_1 \frac{\partial G_1(x)}{\partial z} \tag{5.84}$$

The three equations obtained by inserting eqns (5.80, 5.81) into these equations can
be combined into a single vector equation

$$\frac{d}{d\beta} \left( \frac{1}{\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}}\, \frac{d\mathbf{r}}{d\beta} \right) = \lambda_1 \frac{\mathbf{r}}{a} \tag{5.85}$$

where the constraint was used after the partials were taken to replace $\sqrt{x^2 + y^2 + z^2}$
by $a$.

If, as we did in Section 5.6, we now choose $\beta$ to be the arc-length $s$ measured
along the unvaried path starting at point 1, eqn (5.85) can be simplified further to
give

$$\frac{d\hat{\mathbf{t}}}{ds} = \lambda_1 \frac{\mathbf{r}}{a} \tag{5.86}$$

where $\hat{\mathbf{t}} = d\mathbf{r}/ds$. Using Serret-Frenet methods from Section A.12, along with eqn
(5.80) in the vector form $\sqrt{\mathbf{r} \cdot \mathbf{r}} = a$, eqn (5.86) can be used to prove that the geodesic
is a great circle, the intersection of the spherical surface with a plane passing through
the center of the sphere.

5.11 Reduction of Degrees of Freedom

The method of Lagrange multipliers treated in Sections 5.9 and 5.10 has the advantage
that it treats all the coordinates symmetrically, which avoids upsetting the natural
symmetry of the problem. For example, this allowed the three Euler-Lagrange
equations to be combined neatly into one vector expression, eqn (5.86).
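
That a great circle is consistent with the vector expression eqn (5.86) can be confirmed
numerically. The sketch below is an illustration under assumptions: it places the great circle
in the $x$-$y$ plane, picks an arbitrary radius, and takes the multiplier to be the constant
$\lambda_1 = -1/a$, which is the value that works for this path.

```python
import math

A = 3.0   # hypothetical sphere radius a

def r(s):
    """Great circle in the x-y plane, parameterized by its own arc length s."""
    return (A * math.cos(s / A), A * math.sin(s / A), 0.0)

def tangent(s):
    """Unit tangent t = dr/ds."""
    return (-math.sin(s / A), math.cos(s / A), 0.0)

def dt_ds(s, h=1e-5):
    """Centred-difference derivative of the tangent vector."""
    tp, tm = tangent(s + h), tangent(s - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(tp, tm))

LAMBDA_1 = -1.0 / A    # multiplier value that makes eqn (5.86) hold on this path

def residual(s):
    """Largest component of dt/ds - lambda_1 * r / a."""
    return max(abs(d - LAMBDA_1 * c / A) for d, c in zip(dt_ds(s), r(s)))

worst = max(residual(0.4 * i) for i in range(20))
```

The residual is at the level of the finite-difference error all along the path, so the tangent
vector turns toward the center of the sphere at the rate eqn (5.86) requires.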

However, in some problems the easiest method is simply to eliminate the con- 
strained variables at the outset. Assume that the coordinates have been relabeled as 
was done in the proof in Section 5.9, with free coordinates — x \ , ...,X(n-c) 
and bound coordinates x ^ = x^-c+i). • • • - X N- Then, by construction, | g (fo) | ^ 0 
where g (b) is the matrix defined in eqn (5.69). But, from Theorem D.26.1, this is the 
necessary and sufficient condition for the constraint equations 



0 = G a (x) 



(5.87) 



for a — 1 , . . . , C, to be solved for the bound variables in terms of the free ones, with 
the result for all / = 1 , ,C, 



c (fe) 

\N-C+j) 



c ( *> 
\N-C+j) 



(x {f) x (f) ( X V>) 

\-'l ’ • • ‘ ’ X (N-C)J ~ X (N-C+j) ) 

These equations can then be differentiated using the chain rule to obtain 



(5.88) 



■ (b) 

K (N-C+j) 



dx 



(b) 

(N-C+j) _ .(*) 
d/3 ~ X (N-C+j) 






(5.89) 



These expressions for the bound variables and their derivatives can then be sub- 
stituted into the integral in eqn (5.65) to eliminate the bound variables, giving 



I = 




(5.90) 



where 

/ (x<f>,x<f>) = f (xW, *<*>(*<'>), x&, fW(i(/),f(/))) (5.91) 

is obtained by writing / with the free and bound variables listed separately as 
f = and then substituting eqns (5.88, 5.89) for the bound 

ones. 

Now, eqn (5.90) can be taken as the start of a new problem with no constraints,
which can be solved by the methods of Section 5.5.$^{34}$ Thus, the extremum condition
is just eqn (5.27) with $f$ replaced by $\bar{f}$ and the number of variables reduced from $N$
to $(N - C)$. Thus, for $k = 1, \ldots, (N - C)$,

$$\frac{d}{d\beta}\left(\frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial \dot{x}_k}\right) - \frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial x_k} = 0 \tag{5.92}$$

34 The redundancy of the Euler-Lagrange equations proved in Theorem 5.7.3 will still apply to this new
problem. If constraints have reduced the number of Euler-Lagrange equations from $N$ to $N - C$, then under
the conditions of that Theorem, satisfaction of $N - C - 1$ of them will imply satisfaction of the remaining
one.



Whether this reduction method is superior to the Lagrange multiplier method of 
Section 5.9 depends somewhat on the choice of the original coordinates before the 
reduction is done. If original coordinates are chosen that reflect the symmetry of the 
constraints, the reduction method is often quite simple. In the next two sections we 
give two examples of the reduction method, the first with a nonoptimal choice of 
original coordinates, and the second with a better choice. 

The use of holonomic constraints to reduce the number of dimensions of an ex- 
tremum problem is quite straightforward. We have simply solved for the bound coor- 
dinates and eliminated them from the integral whose extremum path is to be found. 
This simplicity contrasts with the similar problem in Section 3.8, where we had the
additional difficulty of accounting for the forces of constraint.



5.12 Example of a Reduction 

Consider again the problem of finding the geodesics on a sphere of radius a, using 
the same Cartesian coordinates as in Section 5.10. Then, restricting our attention to 
paths entirely on the upper hemisphere, the constraint equation 

$$0 = G_1(x) = \sqrt{x^2 + y^2 + z^2} - a \tag{5.93}$$

can be solved for the bound variable $x^{(b)}_3 = z$ in terms of the free ones
$x^{(f)}_1 = x$ and $x^{(f)}_2 = y$,

$$z = \sqrt{a^2 - x^2 - y^2} \tag{5.94}$$



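Substituting this equation and its derivative into
$f(x, \dot{x}) = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}$ from eqn (5.81) gives

$$\bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right) = \left\{ \dot{x}^2 + \dot{y}^2 + \frac{(x\dot{x} + y\dot{y})^2}{a^2 - x^2 - y^2} \right\}^{1/2} \tag{5.95}$$

from which we obtain the two reduced Euler-Lagrange equations for the free variables
with $k = 1, 2$

$$\frac{d}{d\beta}\left(\frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial \dot{x}}\right) - \frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial x} = 0 \tag{5.96}$$

$$\frac{d}{d\beta}\left(\frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial \dot{y}}\right) - \frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial y} = 0 \tag{5.97}$$
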



which may be solved for the extremum path. 
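
As a numerical sanity check on the reduced integrand of eqn (5.95), the following sketch integrates $\bar{f}$ along a great-circle arc through the pole of the upper hemisphere and along a slightly deformed path with the same endpoints. The radius value, the tilt angle `gamma`, the arc half-width `psi_max`, the bump amplitude `eps`, and the helper names are illustrative assumptions, not quantities taken from the text. The great-circle length should equal $a$ times the subtended angle, and the deformed path should come out longer.

```python
import math

A = 1.0  # sphere radius a (illustrative value)


def f_bar(x, y, xd, yd, a=A):
    """Reduced integrand of eqn (5.95): z and its derivative have been eliminated."""
    return math.sqrt(xd**2 + yd**2 + (x * xd + y * yd) ** 2 / (a**2 - x**2 - y**2))


def path_length(path, n=2000):
    """Simpson's rule for the integral of f_bar over beta in [0, 1] (n must be even).

    `path(beta)` returns the tuple (x, y, xdot, ydot) at parameter value beta.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * f_bar(*path(i * h))
    return total * h / 3.0


def great_circle(beta, gamma=0.3, psi_max=0.6, a=A):
    """Great circle through the pole, in the vertical plane at azimuth gamma."""
    psi = psi_max * (2.0 * beta - 1.0)   # polar angle runs from -psi_max to +psi_max
    dpsi = 2.0 * psi_max                 # d(psi)/d(beta)
    x = a * math.sin(psi) * math.cos(gamma)
    y = a * math.sin(psi) * math.sin(gamma)
    xd = a * math.cos(psi) * dpsi * math.cos(gamma)
    yd = a * math.cos(psi) * dpsi * math.sin(gamma)
    return x, y, xd, yd


def perturbed(beta, eps=0.05):
    """Same endpoints as great_circle, with a small bump added to y."""
    x, y, xd, yd = great_circle(beta)
    bump = eps * math.sin(math.pi * beta)
    bump_dot = eps * math.pi * math.cos(math.pi * beta)
    return x, y + bump, xd, yd + bump_dot


length_gc = path_length(great_circle)
length_pert = path_length(perturbed)
print("great circle:", length_gc, " expected a * 2 * psi_max =", A * 1.2)
print("perturbed path:", length_pert)
```

Along the great circle the integrand reduces to the constant $a\,d\psi/d\beta$, which is why the first length can be compared with a closed-form value; the comparison with the deformed path only illustrates, and does not prove, that the great circle is an extremum.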

5.13 Example of a Better Reduction

In Section 5.12, we used the constraint to eliminate the z variable. It is often much 
simpler to take the preliminary step of choosing a set of coordinates appropriate to 
the symmetries of the constraints, and then eliminate one of the variables. 

Let us return again to the problem of the extremum path on the surface of a
sphere, but now using the coordinates $x^{(f)}_1 = \theta$, $x^{(f)}_2 = \phi$, and
$x^{(b)}_3 = r$ where $r, \theta, \phi$ are spherical polar coordinates. These coordinates
are more appropriate for the spherical constraint. Then

$$S = \int_1^2 ds = \int_{\beta^{(1)}}^{\beta^{(2)}} \sqrt{\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2}\; d\beta \tag{5.98}$$

Due to the clever choice of coordinates, the constraint is reduced to a function of 
one variable only. Moreover, it can be solved to give the actual value of r, not just its 
expression in terms of the other variables, 



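$$0 = G_1(x) = r - a \tag{5.99}$$

Putting the constrained values $r = a$ and $\dot{r} = 0$ into

$$f(x, \dot{x}) = \sqrt{\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2} \tag{5.100}$$

gives the reduced function

$$\bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right) = \sqrt{a^2\dot{\theta}^2 + a^2\sin^2\theta\,\dot{\phi}^2} \tag{5.101}$$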

and hence the two reduced Euler-Lagrange equations

$$\frac{d}{d\beta}\left(\frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial \dot{\theta}}\right) - \frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial \theta} = 0 \tag{5.102}$$

$$\frac{d}{d\beta}\left(\frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial \dot{\phi}}\right) - \frac{\partial \bar{f}\left(x^{(f)}, \dot{x}^{(f)}\right)}{\partial \phi} = 0 \tag{5.103}$$

which may be solved for the extremum path.

5.14 The Coordinate Parametric Method

In the general parametric method presented in this chapter, the integration parameter
$\beta$ in the line integrals is left unspecified until the end of the calculation, when it is
selected to make the Euler-Lagrange equations as simple and transparent as possible.
Some other textbooks, particularly undergraduate ones, instead choose a particular
one of the variables as the integration parameter, and do so at the beginning of the
calculation rather than at the end. Let us call this use of one of the coordinates as the
integration parameter the coordinate parametric method.

Since the reader is likely to have studied the calculus of variations from those
texts at some point, and since the traditional form of Hamilton’s Principle presented 
in Chapter 6 closely resembles the coordinate parametric method, it will be useful to 
state and prove the Euler-Lagrange equations for that method. 

Suppose that, after rearranging the coordinates if necessary, we denote the coordinate
selected to be the integration parameter in the coordinate parametric method
by $x_1$. Derivatives of the other coordinates with respect to $x_1$ will be denoted by $x'_k$ so
that $x'_k = dx_k/dx_1$. The integral to be extremized in the coordinate parametric method
may then be written as

$$I = \int_{x_1^{(1)}}^{x_1^{(2)}} g\left(x, x'_{[1]}\right) dx_1 \tag{5.104}$$

where $x_{[1]} = x_2, \ldots, x_N$ are the remaining variables and $x'_{[1]} = x'_2, \ldots, x'_N$ are their
derivatives with respect to $x_1$. (Here $x$ stands for all of the variables, $x_1, \ldots, x_N$, and
$x_{[1]}$ stands for all of the variables except $x_1$.)

Theorem 5.14.1: Coordinate Euler-Lagrange Theorem 

Assume that the variable $x_1$ chosen to be the integration parameter of the coordinate
parametric method varies monotonically along the unvaried path. Then the first-order
variation of eqn (5.104) vanishes, $\delta I = 0$, for arbitrary variations of the $x_2, \ldots, x_N$
variables with fixed endpoints (and no variation of $x_1$ itself), if and only if the unvaried
path $x_k = x_k(x_1)$ is a solution to the Euler-Lagrange equations

$$\frac{d}{dx_1}\left(\frac{\partial g\left(x, x'_{[1]}\right)}{\partial x'_k}\right) - \frac{\partial g\left(x, x'_{[1]}\right)}{\partial x_k} = 0 \tag{5.105}$$

for $k = 2, \ldots, N$.

Proof: The condition that $x_1$ must vary monotonically is essential. For if $x_1$ were
to be constant along some region of the unvaried path while other coordinates varied,
the derivatives $x'_k = dx_k/dx_1$ would be infinite and the method would fail. The
present theorem can be proved by setting $\beta = x_1$ and $g = f$ in Theorem 5.5.1. The
only difficulty is that $\beta$ does not appear explicitly in $f(x, \dot{x})$, whereas $x_1$ does appear
in $g(x, x'_{[1]})$. But a close inspection of the proof of Theorem 5.5.1 reveals that the
presence of $\beta$ in $f$ would not invalidate the theorem. □
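
A minimal worked illustration of Theorem 5.14.1 may help fix the notation. It assumes the simplest possible case, arc length in a plane with $N = 2$, $x_1 = x$, and $x_{[1]} = y$, which is chosen here only for illustration and is not one of the examples worked in the text.

```latex
g(x, y') = \sqrt{1 + y'^2},
\qquad
\frac{d}{dx}\left(\frac{\partial g}{\partial y'}\right) - \frac{\partial g}{\partial y}
  = \frac{d}{dx}\left(\frac{y'}{\sqrt{1 + y'^2}}\right) = 0
\;\Longrightarrow\;
y' = \text{constant},
\qquad
y = y^{(1)} + y'\left(x - x^{(1)}\right)
```

The single Euler-Lagrange equation gives a straight line, as expected. The same case also shows why monotonic variation of $x_1$ matters: a vertical line $x = \text{constant}$ is equally a straight line, but it cannot be tested with eqn (5.105) because $y'$ is infinite along it.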

The coordinate parametric method may also be used for problems with constraints. 

Theorem 5.14.2: Coordinate Method with Constraints 

Assume that the variable $x_1$ chosen to be the integration parameter of the coordinate
parametric method varies monotonically along the unvaried path. Suppose that the
variations are arbitrary except for the constraints, for $a = 1, \ldots, C$,

$$G_a(x) = 0 \tag{5.106}$$

Then, again with fixed endpoints, the first-order variation of eqn (5.104) vanishes,
$\delta I = 0$, if and only if the chosen unvaried path $x_k = x_k(x_1)$ is a solution to the
Euler-Lagrange equations

$$\frac{d}{dx_1}\left(\frac{\partial g\left(x, x'_{[1]}\right)}{\partial x'_k}\right) - \frac{\partial g\left(x, x'_{[1]}\right)}{\partial x_k} = \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(x)}{\partial x_k} \tag{5.107}$$

for $k = 2, \ldots, N$.

Proof: As noted in the previous theorem, the condition that $x_1$ must vary monotonically
is essential. To prove the present theorem, set $\beta = x_1$, $f = g$ in Theorem 5.9.1.
The only difficulty is that $x_1$ appears explicitly in $g(x, x'_{[1]})$ and $G(x)$ but $\beta$ does not
appear explicitly in the $f(x, \dot{x})$ and $G(x)$ of Theorem 5.9.1. But examination will reveal
that the proof of Theorem 5.9.1 remains valid even with an explicit dependence
of these quantities on $\beta$. □

One problem with the coordinate parametric method is that we have $N$ coordinates
$x_1, \ldots, x_N$ but only $N - 1$ Euler-Lagrange equations. Compared to the general
parametric method in which there is an Euler-Lagrange equation for each coordinate,
the Euler-Lagrange equation involving partial derivatives with respect to $x_1$ has gotten
lost. This lost equation can be recovered by what is often called the second form
of the Euler-Lagrange equations.

Theorem 5.14.3: Second Form of Euler-Lagrange Equations 

The lost Euler-Lagrange equation in the coordinate parametric method may be recovered
by defining the second form $h$ as

$$h = \sum_{k=2}^{N} x'_k \frac{\partial g\left(x, x'_{[1]}\right)}{\partial x'_k} - g\left(x, x'_{[1]}\right) \tag{5.108}$$

The lost Euler-Lagrange equation is then

$$\frac{dh}{dx_1} = -\frac{\partial g\left(x, x'_{[1]}\right)}{\partial x_1} + \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(x)}{\partial x_1} \tag{5.109}$$

In problems with no constraints, or in which the constrained variables have been
eliminated, the last term on the right will be absent.

Proof: The proof closely parallels the proofs of the generalized energy theorems in
Sections 2.15 and 3.13, with the substitutions $h \to H$, $x_1 \to t$, and $x_{[1]} \to q$, and will
not be repeated here. □

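We note finally that any problem stated in the coordinate parametric form can be
converted to general parametric form. Introducing the general monotonic parameter
$\beta$ and writing $dx_1 = \dot{x}_1 d\beta$ and $x'_k = \dot{x}_k/\dot{x}_1$, the integral in eqn (5.104) may be written

$$I = \int_{x_1^{(1)}}^{x_1^{(2)}} g\left(x, x'_{[1]}\right) dx_1 = \int_{\beta^{(1)}}^{\beta^{(2)}} g\left(x, \frac{\dot{x}_{[1]}}{\dot{x}_1}\right) \dot{x}_1\, d\beta = \int_{\beta^{(1)}}^{\beta^{(2)}} f(x, \dot{x})\, d\beta \tag{5.110}$$

where

$$f(x, \dot{x}) = \dot{x}_1\, g\left(x, \frac{\dot{x}_{[1]}}{\dot{x}_1}\right) \tag{5.111}$$
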

is the integrand for use in the general parametric method. There will now be N Euler- 
Lagrange equations, one for each coordinate. The lost equation that was recovered 
by the second form in Theorem 5.14.3 will be just another of the Euler-Lagrange 
equations of the general parametric method, and the second form will no longer be 
necessary. 
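
The conversion in eqns (5.110, 5.111) can be checked numerically. The sketch below assumes, purely for illustration, the plane arc-length integrand $g = \sqrt{1 + y'^2}$, the path $y = x^2$ on $0 \le x \le 1$, and an arbitrarily chosen monotonic parametrization $x_1(\beta) = \beta + 0.2\sin(\pi\beta)$. None of these choices come from the text. The coordinate parametric integral and the general parametric integral should agree to within numerical error.

```python
import math


def simpson(func, lo, hi, n=2000):
    """Composite Simpson's rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def g(x1, y, y_prime):
    """Coordinate parametric integrand: plane arc length with x1 as parameter."""
    return math.sqrt(1.0 + y_prime**2)


def f(x1, y, x1_dot, y_dot):
    """General parametric integrand built from g as in eqn (5.111)."""
    return x1_dot * g(x1, y, y_dot / x1_dot)


def coord_integrand(x1):
    # Path y = x1**2, so y' = 2 * x1.
    return g(x1, x1**2, 2.0 * x1)


def general_integrand(beta):
    # Monotonic parametrization: x1_dot >= 1 - 0.2 * pi > 0 on [0, 1].
    x1 = beta + 0.2 * math.sin(math.pi * beta)
    x1_dot = 1.0 + 0.2 * math.pi * math.cos(math.pi * beta)
    y = x1**2
    y_dot = 2.0 * x1 * x1_dot
    return f(x1, y, x1_dot, y_dot)


I_coord = simpson(coord_integrand, 0.0, 1.0)
I_general = simpson(general_integrand, 0.0, 1.0)
print("coordinate parametric form:", I_coord)
print("general parametric form:   ", I_general)
```

Any other parametrization with $\dot{x}_1 > 0$ and the same endpoints should give the same value, which is the sense in which the general parametric integrand of eqn (5.111) leaves the choice of $\beta$ open.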



5.15 Comparison of the Methods 

Perhaps the clearest way to contrast the two methods is to re-do the example in 
Section 5.6, but now using the coordinate parametric method. Selecting x to be the 
integration parameter, the integral for the optical path length becomes 



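$$O = \int_{x^{(1)}}^{x^{(2)}} n(x, y, z)\sqrt{1 + y'^2 + z'^2}\; dx \tag{5.112}$$

where now $y' = dy/dx$ and $z' = dz/dx$. In terms of the definitions in Section 5.14,
$x_1 = x$, $x_{[1]} = y, z$, and

$$g\left(x, x'_{[1]}\right) = g(x, y, z, y', z') = n(x, y, z)\sqrt{1 + y'^2 + z'^2} \tag{5.113}$$

Equation (5.105) of Theorem 5.14.1 gives the two Euler-Lagrange equations

$$\frac{d}{dx}\left(\frac{\partial g(x, y, z, y', z')}{\partial y'}\right) - \frac{\partial g(x, y, z, y', z')}{\partial y} = 0 \tag{5.114}$$

$$\frac{d}{dx}\left(\frac{\partial g(x, y, z, y', z')}{\partial z'}\right) - \frac{\partial g(x, y, z, y', z')}{\partial z} = 0 \tag{5.115}$$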



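Using eqn (5.113), these two equations reduce to

$$\frac{d}{dx}\left(\frac{y'\, n(x, y, z)}{\sqrt{1 + y'^2 + z'^2}}\right) - \sqrt{1 + y'^2 + z'^2}\;\frac{\partial n(x, y, z)}{\partial y} = 0 \tag{5.116}$$

$$\frac{d}{dx}\left(\frac{z'\, n(x, y, z)}{\sqrt{1 + y'^2 + z'^2}}\right) - \sqrt{1 + y'^2 + z'^2}\;\frac{\partial n(x, y, z)}{\partial z} = 0 \tag{5.117}$$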
The lost third equation can be recovered by using the second form derived in Theorem
5.14.3. It is

$$h = y'\frac{\partial g(x, y, z, y', z')}{\partial y'} + z'\frac{\partial g(x, y, z, y', z')}{\partial z'} - g(x, y, z, y', z') = -\frac{n(x, y, z)}{\sqrt{1 + y'^2 + z'^2}} \tag{5.118}$$

and eqn (5.109) becomes

$$\frac{d}{dx}\left(\frac{n(x, y, z)}{\sqrt{1 + y'^2 + z'^2}}\right) - \sqrt{1 + y'^2 + z'^2}\;\frac{\partial n(x, y, z)}{\partial x} = 0 \tag{5.119}$$

which is the lost equation, equivalent to eqn (5.39) of the general parametric method.
Some extra work would now be required to cast these three equations into a simple
vector form, as was done with the general parametric method in eqn (5.44) of Section
5.6.
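
The closed form of the second form $h$ in eqn (5.118) can be spot-checked by finite differences. The sketch below assumes a made-up index of refraction `n(x, y, z)` and an arbitrary evaluation point; both are hypothetical and serve only to exercise the definition in eqn (5.108) with $x_{[1]} = y, z$.

```python
import math


def n(x, y, z):
    """Hypothetical smooth index of refraction, chosen only for this check."""
    return 1.0 + 0.1 * x + 0.2 * y * y + 0.3 * z


def g(x, y, z, yp, zp):
    """Integrand of eqn (5.113)."""
    return n(x, y, z) * math.sqrt(1.0 + yp * yp + zp * zp)


def second_form(x, y, z, yp, zp, eps=1e-6):
    """h = y' dg/dy' + z' dg/dz' - g, with central-difference partial derivatives."""
    dg_dyp = (g(x, y, z, yp + eps, zp) - g(x, y, z, yp - eps, zp)) / (2.0 * eps)
    dg_dzp = (g(x, y, z, yp, zp + eps) - g(x, y, z, yp, zp - eps)) / (2.0 * eps)
    return yp * dg_dyp + zp * dg_dzp - g(x, y, z, yp, zp)


point = (0.4, -0.3, 0.7, 1.5, -0.8)  # (x, y, z, y', z'), arbitrary test values
h_numeric = second_form(*point)
h_closed = -n(*point[:3]) / math.sqrt(1.0 + point[3] ** 2 + point[4] ** 2)
print("h from definition:", h_numeric)
print("h from closed form:", h_closed)
```

Agreement at one point is of course not a proof, but it is a quick guard against sign errors when the second form is worked out by hand.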

Many readers will have learned the calculus of variations using the coordinate 
parametric method. The present text is urging you to use the general parametric 
method instead. As you decide which method to adopt, in a particular problem or in 
your general perception of the calculus of variations, the following points should be 
considered: 



1. As seen in the example in this section, the coordinate parametric method loses 
one Euler-Lagrange equation. Since Theorem 5.7.3 proves the Euler-Lagrange 
equations redundant, the solution path can still be found. But it is not always 
obvious at the start of a problem which one of the Euler-Lagrange equations 
one wishes to lose. It may turn out that the Euler-Lagrange equation lost as a 
result of your choice of the corresponding coordinate as integration parameter 
was actually the simplest one to solve. 

2. The lost Euler-Lagrange equation in the coordinate method can always be re- 
covered using the so-called second form of the Euler-Lagrange equations. But 
this requires more calculation. It seems preferable to use the general parametric 
method in which all of the available equations are present from the start. Rather 
than recovering information, it seems best not to lose it in the first place. 

3. The coordinate method will fail when the coordinate chosen as the integration
parameter happens to remain constant for a section of the path. But it is not
always obvious in advance which coordinate can be trusted to vary monotonically.
For example, if the problem were to find the geodesic on the surface of
a paraboloid of revolution oriented with its symmetry axis along $\hat{\mathbf{e}}_3$, and if the
cylindrical polar coordinate $\phi$ had been chosen as the integration parameter,
it would be impossible to use the Euler-Lagrange equations of the coordinate
parametric method to test whether the line $\phi = \text{constant}$ is a geodesic (which it
is; see Exercise 5.3).

4. As seen in the example in this section, the premature choice of some variable 
like x as the integration parameter often destroys the symmetry of the Euler- 
Lagrange equations among the variables, and so makes it more difficult to put 
the resulting differential equations into a simple form. The general parametric
method, however, retains whatever symmetry the problem possesses. 

5. Keeping $\beta$ undetermined until after the partial derivatives are taken and the
Euler-Lagrange equations are written out gives one maximum flexibility in solving
the resulting differential equations. For example, we made different choices
above: In Section 5.6, we chose $\beta = s$, the arc length along the solution path,
and in Section 5.8, we chose $\beta = \theta$, the parameter of the cycloid solution.

5.16 Exercises 

Exercise 5.1 Use the calculus of variations to solve the brachistochrone problem. 

(a) Verify the details of the example in Section 5.8, including the derivation of eqns (5.61,
5.62).

(b) Carry out the demonstration that eqn (5.57) defines the extremum path for the wire.

Exercise 5.2 Use the calculus of variations to find the extremum distance between two points 
on the surface of a sphere of radius a. 

(a) First do this problem entirely in Cartesian coordinates, with Lagrange multipliers as
required. Show that the three eqns (5.82, 5.83, 5.84) really do lead to eqn (5.85). Show how the
choice $\beta = s$ transforms eqn (5.85) into eqn (5.86). Use the Serret-Frenet methods of Section
A.12 to prove that the extremum path is the intersection of the sphere’s surface with a plane
passing through its center, i.e., a great circle.

(b) Now do the problem again, but this time use spherical polar coordinates and a reduced
$\bar{f}$ as outlined in Section 5.13. Choosing the $\hat{\mathbf{e}}_3$ axis to pass through the initial point of your
extremum line, show that this line is indeed a great circle.

(c) Are the solutions to your differential equations necessarily the minimum distances be- 
tween the two end points? Or could they be maximum distances? 

Exercise 5.3 The general parametric method may be used to find geodesics on the surface of
a paraboloid of revolution defined in terms of cylindrical coordinates $\rho, \phi, z$ by the equation
$z = a\rho^2$.

(a) Set up the integral to be minimized, using cylindrical polar coordinates.

(b) Eliminate the $z$ variable and write the reduced integrand $\bar{f}(\rho, \phi, \dot{\rho}, \dot{\phi})$ and the two
associated Euler-Lagrange equations.

(c) Consider the path: $\rho = \rho^{(1)}$ with $\phi$ varying, where $\rho^{(1)}$ is a constant. Show that this path
satisfies the Euler-Lagrange equation for $\phi$ but not the one for $\rho$ and hence is not a geodesic.

(d) Explain how this result is consistent with the redundancy of the reduced Euler-Lagrange
equations proved in Theorem 5.7.3. Why does satisfaction of the $\phi$ equation not imply
satisfaction of the $\rho$ equation?

(e) Consider the path: $\phi = \phi^{(1)}$ with $\rho$ varying, where $\phi^{(1)}$ is a constant. Show that this path
is a geodesic.

Exercise 5.4 

(a) Using the result of Exercise 5.2, or otherwise, show that the shortest line of constant 
latitude on the surface of the Earth (horizontal line on a Mercator projection map) is generally 
not the shortest path between its two end points. 

(b) What line would be the exception to this rule? 

Fig. 5.4. Illustration for Exercise 5.5. The cone in (a) is cut along the x-z plane and flattened out
as shown in (b). A straight line drawn on the flattened surface becomes a curve when the cone is 
reassembled. 

Exercise 5.5 An inverted right-circular cone of half-angle $\alpha$ is placed with its apex at the
origin of coordinates and its symmetry axis along $\hat{\mathbf{e}}_3$.

(a) Use the calculus of variations to find the two differential equations describing the
extremum path between two general points on the surface of this cone. [Note: For example, you
might use spherical polar coordinates with the constraint $\theta = \alpha$.]

(b) Suppose the cone to be cut along a line defined by the surface of the cone and the x-z 
plane. The cut cone is then flattened out and a straight line is drawn on the flattened surface. 
The cone is then reassembled. Use the Euler-Lagrange equations you found in part (a) to 
determine if the line you drew (now a curve, of course) is an extremum path on the surface of 
the cone. 

Exercise 5.6 A right-circular cylinder of top radius $a$ is oriented with its symmetry axis along
$\hat{\mathbf{e}}_3$.

(a) Use the calculus of variations and cylindrical polar coordinates to find the two differential 
equations describing the extremum path between two general points on the surface of the 
cylinder. 

(b) Choose $\beta$ equal to $s$, the arc length along the curve, and solve for $\phi$ and $z$ as functions
of $s$.

(c) Suppose the cylinder surface to be cut along a line parallel to its symmetry axis, and the 
cut surface then flattened out onto a table. Draw a diagonal line on that flattened surface and 
then re-assemble the cylinder. Determine if the line you drew (now a curve, of course) is an 
extremum path on the surface of the cylinder. 

Exercise 5.7 

(a) Use the methods in Section 5.6 to show that the extremum (in this case, actually a mini- 
mum) distance between two points in a plane is a straight line. 

(b) Surfaces that can be defined by the continuous motion of a rigid line are called devel- 
opable surfaces. They have the property that, with suitable cuts, they can be flattened out 
onto a plane surface without stretching or tearing them. (The cone in Exercise 5.5 and the 
cylinder in Exercise 5.6 are examples.) Give an argument showing that all developable sur- 
faces have the property that a straight line drawn on their flattened surfaces will be a geodesic 
on the re-assembled curved surfaces. [Hint: Imagine the surface to have a regular arrangement
of atoms, with separation $\ell$ in the surface that will not change when they are flattened
or re-assembled.]

Fig. 5.5. Illustration for Exercise 5.8, Huygens’ Isochronous Pendulum.



Exercise 5.8 Huygens’ Isochronous Pendulum. A mass $m$ hangs from a massless string of
fixed length that swings between two metal sheaves bent into the shape of a cycloid whose
formula is given in eqn (5.57), as shown. The straight part of the string is always tangent to
the cycloid sheave at the point of last contact.

(a) If the string has length $\ell = 4a$, show that the path of the mass $m$ is the same cycloid as
eqn (5.57), but expressed in terms of displaced coordinates $\tilde{x} = x + a\pi$ and $\tilde{y} = y - 2a$. (The
evolute of a cycloid is a cycloid.)

(b) Determine the period of oscillation of the mass, and show that it is independent of the 
amplitude of the pendulum’s swing. 




Fig. 5.6. Illustration for Exercise 5.9. The train enters the tunnel at $\mathbf{r}^{(1)} : (R_\oplus, 0, 0)$ and the tunnel
ends at $\mathbf{r}^{(2)} : (x^{(2)}, y^{(2)}, 0)$.



Exercise 5.9 Suppose that a rail car moves without friction through a tunnel burrowed into
the Earth. It starts from rest, and moves entirely under the influence of the nonuniform
gravitational field inside the Earth, assumed here to be a sphere of uniform density with
gravitational potential $\Phi = M_\oplus G \left(r^2 - 3R_\oplus^2\right)/2R_\oplus^3$, where $G$ is the gravitational constant, $M_\oplus$
is the mass of the Earth, and $R_\oplus$ is its radius. Ignore the rotation of the Earth. Assume that
the tunnel lies entirely in the x-y plane where the origin of coordinates is at the center of the
Earth and $\hat{\mathbf{e}}_1$ points directly toward the point of entry.

(a) Using the general parametric method, write the Euler-Lagrange equations for the path
that extremizes the transit time $T$ from the entry point to the point $(x^{(2)}, y^{(2)}, 0)$.

(b) By choosing $\beta = \eta$ after the Euler-Lagrange equations are written, show that the solution
to these equations is, for suitable choice of $R$, $a$, given by

$$x = (R - a)\cos\eta + a\cos\left(\frac{R - a}{a}\,\eta\right) \qquad y = (R - a)\sin\eta - a\sin\left(\frac{R - a}{a}\,\eta\right) \tag{5.120}$$

which are the equations of a hypocycloid, the line traced out by a point on the circumference
of a circle of radius $a$ that is rolling without slipping along the inside of a circle of radius
$R > a$. The parameter $\eta$ here is the plane-polar angle of the center of the rolling circle.

(c) Suppose that the far end of the extremum tunnel is back at the surface of the Earth. If $D$ is
the distance along the surface of the Earth between entry and exit points, what is the greatest
depth reached by the tunnel? How long did the trip take?

(d) Now rewrite the Euler-Lagrange equations with the choice $\beta = s$, the arc length along
the unvaried path. Write them as a single vector equation, using the notation of the Serret-Frenet
theory of Section A.12. Taking $\mathbf{r}_\parallel$ and $\mathbf{r}_\perp$ to be the resolution of the radius vector $\mathbf{r}$
into vectors parallel and perpendicular to the unit tangent vector $\hat{\mathbf{t}}$, write $d\hat{\mathbf{t}}/ds$ in terms of
$\mathbf{r}_\perp$, $R_\oplus$, $x$, $y$, $z$ only, where $R_\oplus$ is the radius of the Earth.

The general calculus of variations developed in Chapter 5 may be used to derive
variational principles in mechanics. Two different, but closely related, variational
principles are presented here: Hamilton’s Principle and the phase-space Hamilton’s
Principle. One acts in the space of Lagrangian variables $q, \dot{q}, t$ and the other in Hamiltonian
phase space $q, p, t$.

Some authors believe that variational principles are the foundations of physics. For 
example, the classic analytical mechanics text of Landau and Lifshitz (1976) writes 
an action function on page two, and derives the whole of mechanics from it, includ- 
ing Newton’s laws. Whether this is a fair judgement or not, it is certainly true that 
variational principles play a crucial role in quantum theory, general relativity, and 
theoretical physics in general. 



6.1 Hamilton’s Principle in Lagrangian Form 

We now revert to the mechanics notation and denote $dq_k/dt$ by $\dot{q}_k$. This is a change
from the notation of Chapter 5 where $dx_k/d\beta$ was denoted by $\dot{x}_k$.

If we identify $x_1$ with the time $t$, identify $x_{[1]}$ with the Lagrangian generalized
coordinates $q$, and restrict ourselves to cases in which no constraints are present,
the unconstrained Euler-Lagrange equations of the coordinate parametric variational
method in Section 5.14 become

$$\frac{d}{dt}\left(\frac{\partial g(t, q, \dot{q})}{\partial \dot{q}_k}\right) - \frac{\partial g(t, q, \dot{q})}{\partial q_k} = 0 \tag{6.1}$$

for $k = 1, \ldots, D$. These equations are remarkably similar in form to the Lagrange
equations of mechanics derived in Section 2.9 for the case with no constraints and all
forces derived from a potential,

$$\frac{d}{dt}\left(\frac{\partial L(q, \dot{q}, t)}{\partial \dot{q}_k}\right) - \frac{\partial L(q, \dot{q}, t)}{\partial q_k} = 0 \tag{6.2}$$

That similarity underlies Hamilton’s Principle.$^{35}$



35 This similarity is so striking that it seems surprising that Hamilton’s Principle was not stated clearly 
until the middle of the nineteenth century. One possible reason is the authority of Maupertuis, who insisted 
on theological grounds that the system trajectory must be a true minimum of some quantity, since a wise 
God would not waste means. See the discussion in Chapter 3 of Yourgrau and Mandelstam (1968). 
Equation (6.1) is the condition for the extremum of the line integral of the coordinate
parametric method, eqn (5.104). With the above substitutions, it becomes

$$I = \int_{t^{(1)}}^{t^{(2)}} g(t, q, \dot{q})\, dt \tag{6.3}$$

Since eqns (6.1, 6.2) differ only by the appearance of either $g(t, q, \dot{q})$ or $L(q, \dot{q}, t)$ in
them, this suggests that the Lagrange equations of mechanics can be derived from a
variational principle that seeks to extremize the integral

$$I = \int_{t^{(1)}}^{t^{(2)}} L(q, \dot{q}, t)\, dt \tag{6.4}$$

In mechanics, this integral is called the Action Integral, or more simply, the Action.
Hamilton’s Principle states that the natural path of system motion makes the action
integral an extremum.

Theorem 6.1.1: Hamilton’s Principle 

With $I$ defined as in eqn (6.4), and assuming variations that vanish at the endpoints, the
first-order variation $\delta I$ vanishes for arbitrary $\delta q_k$ if and only if the $q_k(t)$ of the unvaried
path are a solution to the Lagrange equations with $Q_k^{(NP)} = 0$. Thus the extremum
condition $\delta I = 0$ holds if and only if, for all $k = 1, \ldots, D$,

$$\frac{d}{dt}\left(\frac{\partial L(q, \dot{q}, t)}{\partial \dot{q}_k}\right) - \frac{\partial L(q, \dot{q}, t)}{\partial q_k} = 0 \tag{6.5}$$


Proof: With the substitutions listed above, the integral in eqn (5.104) becomes iden- 
tical to eqn (6.4). With those same substitutions, the Euler-Lagrange equations, eqn 
(5.105), become identical to eqn (6.5). Theorem 5.14.1 thus proves the present theo- 
rem. □ 

The path in configuration space that is a solution to the Lagrange equations is of- 
ten referred to as the classical path. This is the path of natural motion of a mechanical 
system as it responds to the forces included in the potential part of the Lagrangian. 
Thus we can say that $\delta I = 0$ for variations about a chosen unvaried path if and only
if that chosen path is the classical path. Notice that many different unvaried paths
could be chosen, but that the condition $\delta I = 0$ happens only for variations about the
classical path. Fortunately, that classical path can be found by a procedure better than 
simple trial and error. It is found by solving the Lagrange equations. 
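
The statement that $\delta I = 0$ only about the classical path can be illustrated numerically. The sketch below assumes a one-dimensional harmonic oscillator with unit mass and unit spring constant, $L = \frac{1}{2}\dot{q}^2 - \frac{1}{2}q^2$, the classical path $q = \sin t$ on $0 \le t \le 1$, and the variation shape $\eta(t) = \sin(\pi t)$, which vanishes at both endpoints. These choices, and the function names, are illustrative assumptions rather than examples from the text. The action is evaluated on the varied paths $q + \epsilon\,\eta$, and the first-order and second-order changes are estimated by finite differences in $\epsilon$.

```python
import math


def simpson(func, lo, hi, n=2000):
    """Composite Simpson's rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def action(eps, t_end=1.0):
    """Action integral of eqn (6.4) on the varied path q(t) + eps * eta(t)."""

    def lagrangian(t):
        q = math.sin(t) + eps * math.sin(math.pi * t / t_end)
        q_dot = math.cos(t) + eps * (math.pi / t_end) * math.cos(math.pi * t / t_end)
        return 0.5 * q_dot**2 - 0.5 * q**2  # unit mass, unit spring constant

    return simpson(lagrangian, 0.0, t_end)


eps = 1e-3
I0 = action(0.0)
# Central differences in eps: first-order and second-order coefficients.
first_order = (action(eps) - action(-eps)) / (2.0 * eps)
second_order = (action(eps) + action(-eps) - 2.0 * I0) / eps**2
print("action on classical path:", I0)
print("first-order coefficient (should be near zero):", first_order)
print("second-order coefficient:", second_order)
```

For this particular Lagrangian the action is exactly quadratic in $\epsilon$, so the second-order coefficient can be compared with the value $(\pi^2 - 1)/2$ obtained by integrating $\dot{\eta}^2 - \eta^2$ over the interval. Repeating the calculation about a path that does not satisfy the Lagrange equations, for example $q = t \sin 1$, would be expected to give a first-order coefficient that does not vanish.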



6.2 Hamilton’s Principle with Constraints 

If we make the same substitutions as in Section 6.1, the constrained form of the 
coordinate parametric variational method derived in Theorem 5.14.2 implies a con- 
strained form of Hamilton’s Principle. 
Theorem 6.2.1: Hamilton’s Principle with Constraints

With $I$ defined as in eqn (6.4), consider variations that vanish at the endpoints but are
otherwise arbitrary, except for the $C$ independent holonomic constraints given by eqn
(3.1),

$$0 = G_a(q, t) \tag{6.6}$$

for $a = 1, \ldots, C$. Then $\delta I = 0$ if and only if the $q_k(t)$ of the chosen path are a solution
to the equations

$$\frac{d}{dt}\left(\frac{\partial L(q, \dot{q}, t)}{\partial \dot{q}_k}\right) - \frac{\partial L(q, \dot{q}, t)}{\partial q_k} = \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(q, t)}{\partial q_k} \tag{6.7}$$

for $k = 1, \ldots, D$. Equation (6.7) is the correct equation of motion of the mechanical
system if and only if the forces of constraint do no virtual work.

Proof: With the same substitutions as above in Section 6.1, Theorem 5.14.2 establishes
that $\delta I = 0$ and the conditions in eqn (6.6) do imply eqn (6.7). And Theorem
3.5.1 establishes that eqn (6.7) is the correct equation of motion if and only if the
forces of constraint do no virtual work. □

The quantities $\lambda_a$, which are called Lagrange multipliers in the calculus of variations,
have a special interpretation in the mechanical problem. They are related to the
forces of constraint and may be used, as in eqn (3.14), to derive those forces. Needless
to say, this interpretation of the $\lambda_a$ does not apply when the calculus of variations is
used for nonmechanical problems.

6.3 Comments on Hamilton’s Principle 

We proved in Chapter 2 that the Lagrange equations hold if and only if each point
mass of the mechanical system obeys Newton’s second law. Thus the Lagrange equations
are equivalent to the second law. In Theorem 6.1.1 we have proved that, when
all forces are derived from a potential, $\delta I = 0$ if and only if the Lagrange equations are
satisfied. Thus, the chain of logic has established, at least when no constraint or other
non-potential forces are present, that Hamilton’s Principle is equivalent to Newton’s
second law,

Second Law $\Longleftrightarrow$ Lagrange Equations $\Longleftrightarrow$ Hamilton’s Principle

But the equivalence of Hamilton’s Principle to Newton’s second law is established 
only for the case when all forces are derived from a potential. If constraint forces 
are present in a mechanical system, this equivalence breaks down. Then Hamilton’s 
Principle is equivalent to Newton’s second law only in the idealized case in which the 
constraint forces do no virtual work. Hamilton’s Principle always implies eqn (6.7), 
but that equation is incorrect when the constraint forces have friction and hence 
do virtual work. If the forces of constraint do virtual work, the correct equations of
motion would be something like

$$\frac{d}{dt}\left(\frac{\partial L(q, \dot{q}, t)}{\partial \dot{q}_k}\right) - \frac{\partial L(q, \dot{q}, t)}{\partial q_k} = Q_k^{(\mathrm{frict})} + \sum_{a=1}^{C} \lambda_a \frac{\partial G_a(q, t)}{\partial q_k} \tag{6.8}$$

where the $Q_k^{(\mathrm{frict})}$ are the generalized forces of the friction.

Note to the Reader: If the forces of constraint in a mechanical system happen 
to have friction, and hence do virtual work, then eqn (6.7) will not be the correct 
equation of motion of the system. However, Hamilton’s Principle in the form of 
the variational hypothesis in Theorem 6.2.1 would still imply eqn (6.7). Thus it is 
possible for an incorrect equation to be derived from a variational method. 

Variational principles give an elegant way to express the results of mechanics. But one 
must realize that the calculus of variations is just a language, and like all languages 
can be used to make both true and false statements. 

Hamilton’s Principle in Section 6.1 is analogous to the form of the calculus of
variations called the “coordinate parametric method,” and described in Section 5.14.
In that coordinate parametric method, some coordinate $x_1$ is prematurely removed
from the list of varied coordinates and is made to play the role of integration variable.
As seen in Section 6.1, the variable $t = x_1$ plays that role in Hamilton’s Principle. As
a result, Lagrangian mechanics does indeed require a “second form of the Euler-Lagrange
equations” in analogy to that discussed in Theorem 5.14.3. That second
form is just the generalized energy theorem $\dot{H} = -\partial L(q, \dot{q}, t)/\partial t$, which was derived
in Section 2.15.

But Section 5.15 argued that the general parametric method is simpler and more 
complete than the coordinate parametric method. In the general parametric method, 
the coordinate t would be restored to its proper place as a generalized coordinate and 
the generalized energy theorem (the second form) would be restored to its proper 
place as just another Lagrange equation. The problem of restoring the apparently lost 
symmetry of Lagrangian mechanics, by treating t properly as a coordinate rather than 
as a parameter, is discussed in Part II of the book. Hamilton’s Principle with time as a 
coordinate is treated in Chapter 13. 



6.4 Phase-Space Hamilton’s Principle 

As stated in Chapter 4 on the Hamilton equations, the usefulness of phase space in 
more advanced analytical mechanics depends on the equal treatment of the canonical 
coordinates and momenta. To that end, we now use the calculus of variations to derive 
a phase-space form of Hamilton’s Principle. 

To begin, the action function defined in eqn (6.4) can be rewritten as a line integral 
involving the Hamiltonian. Solving eqn (4.14) for L, and introducing phase-space 
variables, gives

$$ I = \int_{t^{(1)}}^{t^{(2)}} L\,dt = \int_{t^{(1)}}^{t^{(2)}} \left[ \sum_{k=1}^{D} p_k \dot{q}_k - H(q,p,t) \right] dt \tag{6.9} $$



The first-order variation of the line integral in eqn (6.9) may now be taken, using 
the definitions of variations of a function and an integral from Sections 5.3 and 5.4. 
The result will be 



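$$ \delta I = \int_{t^{(1)}}^{t^{(2)}} \left[ \sum_{k=1}^{D} \left( p_k\,\delta\dot{q}_k + \dot{q}_k\,\delta p_k \right) - \delta H(q,p,t) \right] dt $$

$$ \phantom{\delta I} = \sum_{k=1}^{D} \left( p_k\,\delta q_k \right) \Big|_{t^{(1)}}^{t^{(2)}} + \int_{t^{(1)}}^{t^{(2)}} \sum_{k=1}^{D} \left\{ \left( \dot{q}_k - \frac{\partial H(q,p,t)}{\partial p_k} \right) \delta p_k - \left( \dot{p}_k + \frac{\partial H(q,p,t)}{\partial q_k} \right) \delta q_k \right\} dt \tag{6.10} $$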



where an integration by parts has been done. 

Before proceeding to state a phase-space Hamilton’s Principle, we must first discuss
the meaning to be given to variations $\delta p_k$ of the canonical momenta in eqn
(6.10). In Lagrangian mechanics, the generalized momenta $p_k = p_k(q,\dot{q},t)$ are functions
of the Lagrangian variables $q, \dot{q}, t$ and hence $\delta p_k$ would be calculated using eqn
(5.16). The result would be

$$ \delta p_k = \sum_{l=1}^{D} \left[ \frac{\partial p_k(q,\dot{q},t)}{\partial q_l}\,\delta q_l + \frac{\partial p_k(q,\dot{q},t)}{\partial \dot{q}_l}\,\delta\dot{q}_l \right] \tag{6.11} $$

which would make the variation $\delta p_k$ depend on the variations $\delta q$ and their time derivatives and hence
not be an independent variation.

But we want a phase-space Hamilton’s Principle that treats the coordinates and
momenta equally. Thus both $\delta q_k$ and $\delta p_k$ should be treated as independent variations,
unrelated to each other. Therefore, we temporarily forget both the equation
$p_k = p_k(q,\dot{q},t)$ and its inverse $\dot{q}_k = \dot{q}_k(q,p,t)$. (As we will see, these relations will be
recovered at the end of the calculation.) Thus eqn (6.10) will be considered as an
expression involving two equally unknown sets of functions q and p. Equation (6.11)
will therefore no longer hold. The variations of $q_k$ and $p_k$ will now be defined by the
two equations, both holding for $k = 1, \ldots, D$,

$$ q_k(t,\delta a) = q_k(t) + \delta a\,\eta_k(t) \tag{6.12} $$

$$ p_k(t,\delta a) = p_k(t) + \delta a\,\chi_k(t) \tag{6.13} $$

where the shape functions $\eta_k$ and $\chi_k$ are considered to be arbitrary and independent
of one another. Since we wanted both q and p to be considered simply as coordinates
of phase space, we have applied the definition of variation of coordinates from Section
5.2 to both q and p. Then, as in that section, the variations $\delta q_k = \delta a\,\eta_k$ and $\delta p_k = \delta a\,\chi_k$
will all be arbitrary and independent.

We may now state the phase-space form of Hamilton’s Principle. 
Theorem 6.4.1: Phase-Space Hamilton’s Principle

With F defined to be the integrand of eqn (6.9),

$$ F(q,p,\dot{q},\dot{p},t) = \sum_{k=1}^{D} p_k \dot{q}_k - H(q,p,t) \tag{6.14} $$

the action integral

$$ I = \int_{t^{(1)}}^{t^{(2)}} L\,dt = \int_{t^{(1)}}^{t^{(2)}} F(q,p,\dot{q},\dot{p},t)\,dt \tag{6.15} $$

will be an extremum, $\delta I = 0$, for variations $\delta q$ and $\delta p$ that are arbitrary except for the
requirement that they vanish at the endpoints $t^{(1)}$ and $t^{(2)}$, if and only if the Hamilton
equations

$$ \dot{q}_k = \frac{\partial H(q,p,t)}{\partial p_k} \qquad \dot{p}_k = -\frac{\partial H(q,p,t)}{\partial q_k} \tag{6.16} $$

hold on the unvaried path.



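Proof: Since $\delta q_k$ vanishes at $t^{(1)}$ and $t^{(2)}$ by assumption, the integrated term vanishes
and eqn (6.10) becomes

$$ \delta I = \int_{t^{(1)}}^{t^{(2)}} \sum_{k=1}^{D} \left\{ \left( \dot{q}_k - \frac{\partial H(q,p,t)}{\partial p_k} \right) \delta p_k - \left( \dot{p}_k + \frac{\partial H(q,p,t)}{\partial q_k} \right) \delta q_k \right\} dt \tag{6.17} $$

Since both $\delta q$ and $\delta p$ are now arbitrary and independent, they may be set nonzero
one at a time. Hence $\delta I = 0$ if and only if eqn (6.16) holds, as was to be proved.$^{36}$ □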

Notice that the first Hamilton equation gives $\dot{q}_k = \dot{q}_k(q,p,t)$ as an equation of
motion. Thus the relation between p and $\dot{q}$ is recovered. The difference between the
Hamiltonian and Lagrangian approaches is that in Lagrangian theory the relation
$\dot{q}_k = \dot{q}_k(q,p,t)$ is an identity, true both on the unvaried and on all varied paths. But
in the phase-space form of Hamilton’s Principle, that relation is an equation of motion
that is true only on the classical path. It is part of the definition of the classical path.
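Theorem 6.4.1 can be checked numerically in a simple case. The sketch below is illustrative only: it assumes a one-dimensional harmonic oscillator with unit mass and frequency, $H = p^2/2 + q^2/2$, and the names `action`, `first_variation`, `eta`, and `chi` are hypothetical helpers, not notation from the text. It estimates the derivative of the action of eqn (6.9) with respect to the scale parameter $\delta a$ of eqns (6.12, 6.13), once on a path obeying the Hamilton equations and once on a path that violates $\dot{q} = \partial H/\partial p$.

```python
import math

def action(q, qdot, p, hamiltonian, t1, t2, n=2000):
    """Trapezoid estimate of I = integral of (p*qdot - H) dt, eqn (6.9), with D = 1."""
    h = (t2 - t1) / n
    total = 0.0
    for i in range(n + 1):
        t = t1 + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * (p(t) * qdot(t) - hamiltonian(q(t), p(t), t))
    return total * h

def first_variation(q, qdot, p, eta, etadot, chi, hamiltonian, t1, t2, da=1e-3):
    """Central-difference estimate of dI/d(delta a) at delta a = 0."""
    def varied(s):
        # Varied paths of eqns (6.12, 6.13); the velocity varies with etadot.
        return action(lambda t: q(t) + s * eta(t),
                      lambda t: qdot(t) + s * etadot(t),
                      lambda t: p(t) + s * chi(t),
                      hamiltonian, t1, t2)
    return (varied(da) - varied(-da)) / (2.0 * da)

def oscillator(q, p, t):
    """Assumed example Hamiltonian: H = p^2/2 + q^2/2."""
    return 0.5 * p * p + 0.5 * q * q

def eta(t):
    """Shape function for q; vanishes at t = 0 and t = 1."""
    return math.sin(math.pi * t)

def etadot(t):
    return math.pi * math.cos(math.pi * t)

def chi(t):
    """Shape function for p, chosen independently of eta."""
    return math.sin(2.0 * math.pi * t)

def half_cos(t):
    """A momentum that does NOT satisfy qdot = dH/dp when q = sin t."""
    return 0.5 * math.cos(t)

T1, T2 = 0.0, 1.0
# q = sin t, p = cos t satisfies both Hamilton equations, eqn (6.16).
on_path = first_variation(math.sin, math.cos, math.cos,
                          eta, etadot, chi, oscillator, T1, T2)
# Same q(t), but the momentum is wrong, so the path is not classical.
off_path = first_variation(math.sin, math.cos, half_cos,
                           eta, etadot, chi, oscillator, T1, T2)
print(on_path, off_path)
```

Under these assumptions the first value vanishes to within the quadrature error, while the second does not, which illustrates that the relation between p and $\dot{q}$ is imposed by the variational principle rather than assumed beforehand.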



6.5 Exercises 

Exercise 6.1 This exercise gives an alternate proof of Theorem 6.4.1, using Theorem 5.14.1.
(a) With the substitutions $x_{[1]} = q_1, \ldots, q_D, p_1, \ldots, p_D$ and $x_1 = t$, show that the Euler-Lagrange
equations in eqn (5.105) become, for $k = 1, \ldots, D$,

$$ \frac{d}{dt}\left( \frac{\partial F(q,p,\dot{q},\dot{p},t)}{\partial \dot{q}_k} \right) - \frac{\partial F(q,p,\dot{q},\dot{p},t)}{\partial q_k} = 0 \tag{6.18} $$

$$ \frac{d}{dt}\left( \frac{\partial F(q,p,\dot{q},\dot{p},t)}{\partial \dot{p}_k} \right) - \frac{\partial F(q,p,\dot{q},\dot{p},t)}{\partial p_k} = 0 \tag{6.19} $$

(b) With F given by eqn (6.14), show that these Euler-Lagrange equations imply the Hamilton
equations, eqn (6.16).



36 See the proof of the Euler-Lagrange theorem, Theorem 5.5.1, for more detail about setting arbitrary 
variations nonzero one at a time. 
Linear vector functions of vectors, and the related dyadic notation, are important in
the study of rigid body motion and the covariant formulations of relativistic mechanics.
In this chapter we introduce these topics and present methods which we will need
later.

Linear vector functions of vectors have a rich structure, with up to nine indepen- 
dent parameters needed to characterize them, and vector outputs that need not even 
have the same directions as the vector inputs. The subject of linear vector operators 
merits a chapter to itself not only for its importance in analytical mechanics, but also 
because study of it will help the reader to master the operator formalism of quantum 
mechanics. 

7.1 Definition of Operators 

It seems easiest to write linear vector functions of vectors using the operator notation
familiar from quantum mechanics, but perfectly applicable here as well. To say that
some function maps vector $\mathbf{A}$ into vector $\mathbf{B}$ we could write $\mathbf{B} = f(\mathbf{A})$, where $f$ denotes
the vector function. It is easier and clearer to write instead $\mathbf{B} = \mathcal{F}\mathbf{A}$, with operator $\mathcal{F}$
thought of as operating to the right on $\mathbf{A}$ and converting it into $\mathbf{B}$. The linearity of $\mathcal{F}$
is expressed by defining its operation on $\mathbf{A} = \alpha\mathbf{V} + \beta\mathbf{W}$, where $\alpha, \beta$ are scalars, to be

$$ \mathcal{F}\mathbf{A} = \mathcal{F}(\alpha\mathbf{V} + \beta\mathbf{W}) = \alpha\mathcal{F}\mathbf{V} + \beta\mathcal{F}\mathbf{W} \tag{7.1} $$

Linearity says that the result of operating on the sum $\mathbf{A}$ is the same as operating on each
of its terms and then doing the sum. Also, the scalar factors $\alpha, \beta$ may be applied either
before or after operation with $\mathcal{F}$, giving the same result in either case. For example,
$\mathcal{F}(\alpha\mathbf{V}) = \alpha\mathcal{F}\mathbf{V}$.

Since operators are defined by their action on vectors, two operators are equal,
$\mathcal{A} = \mathcal{B}$, if and only if

$$ \mathcal{A}\mathbf{V} = \mathcal{B}\mathbf{V} \tag{7.2} $$

for any arbitrary vector $\mathbf{V}$. For linear operators, this condition is equivalent to requiring
only that $\mathcal{A}\mathbf{V}_k = \mathcal{B}\mathbf{V}_k$ for any three non-coplanar vectors $\mathbf{V}_1, \mathbf{V}_2, \mathbf{V}_3$, since any
arbitrary vector $\mathbf{V}$ can be expressed as a linear combination of these three.

The null operator $\mathcal{O}$ and the identity (or unity) operator $\mathcal{U}$ are defined by $\mathcal{O}\mathbf{V} = 0$
and $\mathcal{U}\mathbf{V} = \mathbf{V}$ for any vector $\mathbf{V}$, where we adopt the usual convention of denoting the
null vector by the number 0. The null operator $\mathcal{O}$ is also usually denoted by just the
number 0. This notational sloppiness seems not to lead to problems in either case.
Thus expressions like $\mathcal{A} = 0$ are allowed, although $\mathcal{A} = 3$ would be nonsense unless
intended to be an (even more sloppy) short form for $\mathcal{A} = 3\,\mathcal{U}$.

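Linear operators can be added, subtracted, and multiplied by numbers. The definition
is that

$$ \mathcal{C} = \alpha\mathcal{A} + \beta\mathcal{B} \tag{7.3} $$

if and only if

$$ \mathcal{C}\mathbf{V} = \alpha\mathcal{A}\mathbf{V} + \beta\mathcal{B}\mathbf{V} \tag{7.4} $$

for any arbitrary vector $\mathbf{V}$. It follows from the properties of vector addition that addition
of operators is commutative and associative,

$$ \mathcal{A} + \mathcal{B} = \mathcal{B} + \mathcal{A} \quad\text{and}\quad (\mathcal{A} + \mathcal{B}) + \mathcal{C} = \mathcal{A} + (\mathcal{B} + \mathcal{C}) \tag{7.5} $$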

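The multiplication of operators is defined to mean successive application. Thus

$$ \mathcal{C} = \mathcal{A}\mathcal{B} \quad\text{if and only if}\quad \mathcal{C}\mathbf{V} = \mathcal{A}(\mathcal{B}\mathbf{V}) \tag{7.6} $$

for any vector $\mathbf{V}$. Operator $\mathcal{B}$ acts on $\mathbf{V}$ first, producing another vector $\mathcal{B}\mathbf{V}$. The operator
$\mathcal{A}$ then acts on that vector to produce the final result. Operator multiplication is
associative,

$$ (\mathcal{A}\mathcal{B})\,\mathcal{C} = \mathcal{A}\,(\mathcal{B}\mathcal{C}) = \mathcal{A}\mathcal{B}\mathcal{C} \tag{7.7} $$

since all three expressions acting on an arbitrary $\mathbf{V}$ reduce to the same result
$\mathcal{A}(\mathcal{B}(\mathcal{C}\mathbf{V}))$.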

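However, operator multiplication is in general not commutative. In general
$\mathcal{A}\mathcal{B} \neq \mathcal{B}\mathcal{A}$. The commutator of the two operators is another operator $[\mathcal{A}, \mathcal{B}]_c$ defined
by$^{37}$

$$ [\mathcal{A}, \mathcal{B}]_c = \mathcal{A}\mathcal{B} - \mathcal{B}\mathcal{A} \tag{7.8} $$

If $[\mathcal{A}, \mathcal{B}]_c = 0$, where here we use the number 0 for the null operator as noted above,
then the two operators are said to commute. The commutator is anti-symmetric in the
exchange of its two operators, and hence any operator commutes with itself,

$$ [\mathcal{B}, \mathcal{A}]_c = -[\mathcal{A}, \mathcal{B}]_c \quad\text{and}\quad [\mathcal{A}, \mathcal{A}]_c = 0 \tag{7.9} $$

The evaluation of commutators is aided by some easily proved algebraic rules. With
scalars $\beta, \gamma$,

$$ [\mathcal{A}, (\beta\mathcal{B} + \gamma\mathcal{C})]_c = \beta\,[\mathcal{A}, \mathcal{B}]_c + \gamma\,[\mathcal{A}, \mathcal{C}]_c \tag{7.10} $$

$$ [\mathcal{A}\mathcal{B}, \mathcal{C}]_c = \mathcal{A}\,[\mathcal{B}, \mathcal{C}]_c + [\mathcal{A}, \mathcal{C}]_c\,\mathcal{B} \tag{7.11} $$

$$ [\mathcal{F}, [\mathcal{G}, \mathcal{H}]_c]_c + [\mathcal{H}, [\mathcal{F}, \mathcal{G}]_c]_c + [\mathcal{G}, [\mathcal{H}, \mathcal{F}]_c]_c = 0 \tag{7.12} $$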
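The commutator rules above can be spot-checked using 3 x 3 integer matrices as stand-ins for the operators. This is only a sketch: the matrices `A`, `B`, `C` are arbitrary illustrative values, and the helper names `matmul`, `combine`, and `comm` are hypothetical. Integer arithmetic keeps every comparison exact.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def combine(a, b, sa=1, sb=1):
    """Linear combination sa*a + sb*b of two 3x3 matrices."""
    return [[sa * a[i][j] + sb * b[i][j] for j in range(3)] for i in range(3)]

def comm(a, b):
    """Commutator ab - ba, as in eqn (7.8)."""
    return combine(matmul(a, b), matmul(b, a), 1, -1)

# Arbitrary test matrices (illustrative values, not taken from the text).
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[0, 1, 1], [2, 0, 1], [1, 1, 0]]
C = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
ZERO = [[0] * 3 for _ in range(3)]

# AB differs from BA for this pair.
noncommuting = comm(A, B) != ZERO
# Anti-symmetry, first part of eqn (7.9).
antisymmetric = comm(B, A) == combine(ZERO, comm(A, B), 1, -1)
# Product rule, eqn (7.11).
product_rule = comm(matmul(A, B), C) == combine(matmul(A, comm(B, C)),
                                                matmul(comm(A, C), B))
# Jacobi identity, eqn (7.12), with (F, G, H) = (A, B, C).
jacobi = combine(combine(comm(A, comm(B, C)), comm(C, comm(A, B))),
                 comm(B, comm(C, A))) == ZERO
print(noncommuting, antisymmetric, product_rule, jacobi)
```

A numerical check on one triple of matrices does not prove the identities, but it is a quick guard against sign or ordering mistakes when they are used.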

For every operator $\mathcal{A}$ there is another operator $\mathcal{A}^T$ called its transpose, which is
defined by the condition that

$$ (\mathcal{A}\mathbf{V}) \cdot \mathbf{W} = \mathbf{V} \cdot \mathcal{A}^T \mathbf{W} \tag{7.13} $$

holds for any arbitrary vectors $\mathbf{V}$, $\mathbf{W}$.

37 The subscript c is to distinguish the commutator of two operators from the similarly denoted Poisson
bracket defined in Section 4.6. The algebra of commutators resembles that of Poisson brackets, as may
be seen by comparing the identities in eqns (7.10 - 7.12) with those in eqns (4.55 - 4.57). The algebraic
similarity of commutators and Poisson brackets has important consequences in quantum mechanics, as
discussed in Section 12.13.

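It follows from this definition that $(\mathcal{A}^T)^T = \mathcal{A}$ and that the transpose of a product
of operators is the product of the transposes, but in reverse order. To establish this
last result, note that eqns (7.6, 7.13) imply that

$$ (\mathcal{A}\mathcal{B}\mathbf{V}) \cdot \mathbf{W} = (\mathcal{A}(\mathcal{B}\mathbf{V})) \cdot \mathbf{W} = (\mathcal{B}\mathbf{V}) \cdot \mathcal{A}^T\mathbf{W} = \mathbf{V} \cdot \mathcal{B}^T\mathcal{A}^T\mathbf{W} \tag{7.14} $$

and hence, again using definition eqn (7.13), that

$$ (\mathcal{A}\mathcal{B})^T = \mathcal{B}^T\mathcal{A}^T \tag{7.15} $$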



7.2 Operators and Matrices 

We know that, once an orthonormal basis $\hat{\mathbf{e}}_i$ is chosen for a three-dimensional vector
space, a one-to-one relation can be established between vectors and the 3 x 1 matrices,
called column vectors, made up of the vector components in that basis,

$$ \mathbf{V} \Longleftrightarrow [V] \quad\text{where}\quad [V] = \begin{pmatrix} V_1 \\ V_2 \\ V_3 \end{pmatrix} \quad\text{with}\quad V_i = \hat{\mathbf{e}}_i \cdot \mathbf{V} \tag{7.16} $$

for $i = 1, 2, 3$.

The relation is one-to-one because not only does every vector determine its
components by the last of eqn (7.16), but also, given its components, any vector $\mathbf{V}$
can be determined by writing it as

$$ \mathbf{V} = \sum_{i=1}^{3} V_i\,\hat{\mathbf{e}}_i \tag{7.17} $$

Thus two vectors are equal, with $\mathbf{V} = \mathbf{W}$, if and only if $[V] = [W]$, or in component
form $V_i = W_i$ for all $i = 1, 2, 3$.

Operators are similar to vectors in that, once an orthonormal basis is chosen, each 
operator is associated uniquely with a matrix. But in the case of operators, the matrix 
is a 3 x 3 square matrix with nine components. 

Definition 7.2.1: Matrix Elements 

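Assuming that a basis $\hat{\mathbf{e}}_i$ has been chosen, there is a one-to-one relation between an
operator and its matrix in this basis given by the definition

$$ \mathcal{F} \Longleftrightarrow F \quad\text{where}\quad F = \begin{pmatrix} F_{11} & F_{12} & F_{13} \\ F_{21} & F_{22} & F_{23} \\ F_{31} & F_{32} & F_{33} \end{pmatrix} \quad\text{with}\quad F_{ij} = \hat{\mathbf{e}}_i \cdot \mathcal{F}\hat{\mathbf{e}}_j \tag{7.18} $$

for $i, j = 1, 2, 3$. The nine numbers $F_{ij}$ are called the matrix elements of operator $\mathcal{F}$ in
the $\hat{\mathbf{e}}_i$ basis.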
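Just as its components in some basis determine a vector $\mathbf{V}$, so the matrix in some
basis determines the operator. Imagine that the linear operator $\mathcal{F}$ operates on a vector
expanded as $\mathbf{V} = \sum_{j=1}^{3} V_j\,\hat{\mathbf{e}}_j$. Denote the result by $\mathbf{W}$. Then

$$ \mathbf{W} = \mathcal{F}\mathbf{V} = \mathcal{F}\left( \sum_{j=1}^{3} V_j\,\hat{\mathbf{e}}_j \right) = \sum_{j=1}^{3} V_j\,\mathcal{F}\hat{\mathbf{e}}_j \tag{7.19} $$

where the linearity of operators from eqn (7.1) was used to derive the last equality.
Then the component $W_i$ of vector $\mathbf{W}$ is

$$ W_i = \hat{\mathbf{e}}_i \cdot \mathbf{W} = \sum_{j=1}^{3} V_j \left( \hat{\mathbf{e}}_i \cdot \mathcal{F}\hat{\mathbf{e}}_j \right) = \sum_{j=1}^{3} F_{ij} V_j \tag{7.20} $$

where eqn (7.18) was used to get the matrix elements $F_{ij}$. Equation (7.20) can be
written in matrix notation as

$$ \begin{pmatrix} W_1 \\ W_2 \\ W_3 \end{pmatrix} = \begin{pmatrix} F_{11} & F_{12} & F_{13} \\ F_{21} & F_{22} & F_{23} \\ F_{31} & F_{32} & F_{33} \end{pmatrix} \begin{pmatrix} V_1 \\ V_2 \\ V_3 \end{pmatrix} \quad\text{or, more succinctly,}\quad [W] = F\,[V] \tag{7.21} $$

where the 3 x 3 matrix is denoted by the single letter $F$.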

Thus, given any vector $\mathbf{V}$, knowledge of the matrix elements $F_{ij}$ will uniquely
determine the vector $\mathbf{W}$. Since operators are defined by their action on vectors, this
defines $\mathcal{F}$ completely. Thus $\mathcal{A} = \mathcal{B}$ if and only if $A = B$, or in component form
$A_{ij} = B_{ij}$ for all $i, j = 1, 2, 3$.
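Definition 7.2.1 can be turned directly into a small computation. In the sketch below, a linear operator is modeled as a Python function from 3-tuples to 3-tuples; the function `shear_and_swap` is a hypothetical example chosen only for illustration, and `matrix_of` and `apply` are assumed helper names. The matrix is built from eqn (7.18) and then used in the component form of eqn (7.20).

```python
def dot(u, v):
    """Ordinary dot product of two real 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Orthonormal basis vectors e_1, e_2, e_3 as component triples.
E = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def matrix_of(op):
    """Matrix elements F_ij = e_i . (F e_j), eqn (7.18)."""
    return [[dot(E[i], op(E[j])) for j in range(3)] for i in range(3)]

def apply(F, v):
    """Component form W_i = sum_j F_ij V_j, eqn (7.20)."""
    return tuple(sum(F[i][j] * v[j] for j in range(3)) for i in range(3))

def shear_and_swap(v):
    """A hypothetical linear vector function of a vector."""
    x, y, z = v
    return (x + 2 * y, z, 3 * x - y)

F = matrix_of(shear_and_swap)
V = (1, -2, 4)
W_from_matrix = apply(F, V)
W_from_operator = shear_and_swap(V)
print(F, W_from_matrix, W_from_operator)
```

The two results agree because the function is linear; applying `matrix_of` to a nonlinear function would still return nine numbers, but they would no longer reproduce its action on a general vector.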

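The matrices corresponding to the null and unit operators are easily found from
eqn (7.18). They are the null and unity matrices, with $O_{ij} = 0$ and $U_{ij} = \delta_{ij}$, respectively,
where $\delta_{ij}$ is the Kronecker delta function. Thus

$$ \mathcal{O} \Longleftrightarrow O = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad \mathcal{U} \Longleftrightarrow U = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{7.22} $$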



The matrix $A^T$ corresponding to the transposed operator $\mathcal{A}^T$ defined in eqn (7.13)
has the matrix elements $A^T_{ij} = A_{ji}$ for all $i, j = 1, 2, 3$. To see this, replace $\mathbf{V}$ by $\hat{\mathbf{e}}_i$ and
$\mathbf{W}$ by $\hat{\mathbf{e}}_j$ in eqn (7.13), and use eqn (7.18). The matrix element $A^T_{ij}$ of matrix $A^T$ is
thus

$$ A^T_{ij} = \hat{\mathbf{e}}_i \cdot \mathcal{A}^T\hat{\mathbf{e}}_j = (\mathcal{A}\hat{\mathbf{e}}_i) \cdot \hat{\mathbf{e}}_j = \hat{\mathbf{e}}_j \cdot (\mathcal{A}\hat{\mathbf{e}}_i) = A_{ji} \tag{7.23} $$

where the symmetry of dot products proved in Section A.2 has been used.

The result in eqn (7.23) corresponds exactly to the definition of the transpose of a
matrix in Section B.2. Also, eqn (B.24) shows that

$$ (A\,B)^T = B^T A^T \tag{7.24} $$

which is consistent with eqn (7.15) for operators.

7.3 Addition and Multiplication

As discussed in Section 7.1, operators can be added or multiplied. The matrices corresponding
to the resulting operators are obtained by addition or multiplication of the
associated matrices.

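Let operators $\mathcal{A}$, $\mathcal{B}$ have corresponding matrices $A$, $B$, respectively. Let
$\mathcal{C} = \alpha\mathcal{A} + \beta\mathcal{B}$. Then using eqns (7.4, 7.18) gives the corresponding matrix $C$ as

$$ C_{ij} = \hat{\mathbf{e}}_i \cdot \mathcal{C}\hat{\mathbf{e}}_j = \hat{\mathbf{e}}_i \cdot (\alpha\mathcal{A} + \beta\mathcal{B})\,\hat{\mathbf{e}}_j = \hat{\mathbf{e}}_i \cdot (\alpha\mathcal{A}\hat{\mathbf{e}}_j + \beta\mathcal{B}\hat{\mathbf{e}}_j) = \alpha A_{ij} + \beta B_{ij} \tag{7.25} $$

which may be written in matrix form as $C = \alpha A + \beta B$.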

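Let $\mathcal{D} = \mathcal{A}\mathcal{B}$. The corresponding matrix $D$ is

$$ D_{ik} = \hat{\mathbf{e}}_i \cdot \mathcal{D}\hat{\mathbf{e}}_k = \hat{\mathbf{e}}_i \cdot (\mathcal{A}\mathcal{B}\hat{\mathbf{e}}_k) = \hat{\mathbf{e}}_i \cdot (\mathcal{A}(\mathcal{B}\hat{\mathbf{e}}_k)) \tag{7.26} $$

where eqn (7.6) has been used to get the last equality. But, like any vector, $\mathcal{B}\hat{\mathbf{e}}_k$ can be
expanded in the $\hat{\mathbf{e}}_i$ basis as

$$ \mathcal{B}\hat{\mathbf{e}}_k = \sum_{j=1}^{3} \hat{\mathbf{e}}_j \left( \hat{\mathbf{e}}_j \cdot \mathcal{B}\hat{\mathbf{e}}_k \right) = \sum_{j=1}^{3} \hat{\mathbf{e}}_j B_{jk} \tag{7.27} $$

Putting this result into eqn (7.26) then gives

$$ D_{ik} = \hat{\mathbf{e}}_i \cdot \mathcal{A}\left( \sum_{j=1}^{3} \hat{\mathbf{e}}_j B_{jk} \right) = \sum_{j=1}^{3} \left( \hat{\mathbf{e}}_i \cdot \mathcal{A}\hat{\mathbf{e}}_j \right) B_{jk} = \sum_{j=1}^{3} A_{ij} B_{jk} \tag{7.28} $$

Since the second index $j$ of $A_{ij}$ matches the first index of $B_{jk}$, eqn (7.28) is equivalent
to the matrix multiplication $D = A\,B$.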

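Equations (7.25, 7.28) may be summarized as the correspondences

$$ \alpha\mathcal{A} + \beta\mathcal{B} \Longleftrightarrow \alpha A + \beta B \tag{7.29} $$

$$ \mathcal{A}\mathcal{B} \Longleftrightarrow A\,B \tag{7.30} $$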
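The correspondence between operator products and matrix products can be illustrated with two hypothetical linear functions, `op_a` and `op_b`, invented here only as examples. The sketch composes them in both orders, extracts the matrices with eqn (7.18), and compares with the matrix products of eqn (7.28). The helper names are assumptions, not notation from the text.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

E = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # orthonormal basis

def matrix_of(op):
    """Matrix elements e_i . (op e_j), eqn (7.18)."""
    return [[dot(E[i], op(E[j])) for j in range(3)] for i in range(3)]

def matmul(a, b):
    """Matrix product, D_ik = sum_j A_ij B_jk, eqn (7.28)."""
    return [[sum(a[i][j] * b[j][k] for j in range(3)) for k in range(3)]
            for i in range(3)]

def op_a(v):
    """Hypothetical linear operator A."""
    x, y, z = v
    return (2 * x - z, x + y, 3 * z)

def op_b(v):
    """Hypothetical linear operator B."""
    x, y, z = v
    return (y, x + 4 * z, -x + 2 * y)

def op_ab(v):
    """Product AB: B acts first, then A, eqn (7.6)."""
    return op_a(op_b(v))

def op_ba(v):
    """Product BA: A acts first, then B."""
    return op_b(op_a(v))

A, B = matrix_of(op_a), matrix_of(op_b)
AB, BA = matmul(A, B), matmul(B, A)
print(matrix_of(op_ab) == AB, matrix_of(op_ba) == BA, AB != BA)
```

The order matters: the matrix of "B first, then A" is the product with $A$ on the left, and for this pair the two orderings give different matrices.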



7.4 Determinant, Trace, and Inverse 

Given the basis $\hat{\mathbf{e}}_i$, eqn (7.18) defines the nine components $F_{ij}$ of the matrix $F$ that
corresponds uniquely to operator $\mathcal{F}$, in exactly the same sense that the last of eqn
(7.16) defines the three components $V_i$ of the column vector $[V]$ that corresponds
uniquely to vector $\mathbf{V}$ in that same basis.

If an alternate orthonormal basis $\hat{\mathbf{e}}'_i$ is chosen, assumed also to be right-handed
with $\hat{\mathbf{e}}'_1 \times \hat{\mathbf{e}}'_2 = \hat{\mathbf{e}}'_3$, then vector $\mathbf{V}$ and operator $\mathcal{F}$ will also have a unique relation to
column vector $[V']$ and matrix $F'$ in this alternate basis. The vector components and
matrix elements in the alternate basis are given by the same formulas as in Section
7.2, but now using the primed basis vectors. Thus

$$ V'_i = \hat{\mathbf{e}}'_i \cdot \mathbf{V} \quad\text{and}\quad F'_{ij} = \hat{\mathbf{e}}'_i \cdot \mathcal{F}\hat{\mathbf{e}}'_j \tag{7.31} $$

The unique relation between vectors and column vectors, and between operators and
matrices, holds in either basis, and hence

$$ [V'] \Longleftrightarrow \mathbf{V} \Longleftrightarrow [V] \quad\text{and}\quad F' \Longleftrightarrow \mathcal{F} \Longleftrightarrow F \tag{7.32} $$

This chain of unique correspondence has the consequence that any equation involving
matrices and column vectors in basis $\hat{\mathbf{e}}_i$ will be true if and only if the same equation
is true when primes are put on all the matrices and column vectors, indicating that
they refer to the alternate basis $\hat{\mathbf{e}}'_i$.

In general, even for the same $i, j$ indices, $V'_i$ and $F'_{ij}$ will be quite different from
$V_i$ and $F_{ij}$. However, there are certain quantities calculated from these numbers that
have the same value no matter what basis is used. These are called invariant or basis-independent
quantities. Two such quantities are the determinant and trace.

Lemma 7.4.1: Invariance of Determinant and Trace

If an operator $\mathcal{F}$ has corresponding matrices $F$ and $F'$ in the two bases $\hat{\mathbf{e}}_i$ and $\hat{\mathbf{e}}'_i$,
respectively, then

$$ |F| = |F'| \quad\text{and}\quad \operatorname{Tr} F = \operatorname{Tr} F' \tag{7.33} $$

Proof: In Section 8.32 of Chapter 8, it will be proved that $F' = R^T F R$ where $R$
is the matrix, expressed in the $\hat{\mathbf{e}}_i$ basis, of a proper orthogonal operator $\mathcal{R}$ defined by
$\hat{\mathbf{e}}'_i = \mathcal{R}\hat{\mathbf{e}}_i$ for $i = 1, 2, 3$. This operator will be proved there to have the property that
$R^T R = U = R R^T$ and $|R| = 1$. It follows, using Property 5 and Property 10 of
Section B.11, that $|F'| = |R^T|\,|F|\,|R| = |F|$ as was to be proved. Also, from eqn
(B.33), $\operatorname{Tr} F' = \operatorname{Tr}(R^T F R) = \operatorname{Tr}(R R^T F) = \operatorname{Tr} F$, as was to be proved. □
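Lemma 7.4.1 is easy to check numerically. The sketch below assumes, purely for illustration, that the primed basis is obtained by a rotation of 0.7 radians about the third axis, so that $R$ is the familiar rotation matrix; the matrix `F` holds arbitrary values, and the helper names are hypothetical. The individual matrix elements change, but the determinant and trace do not.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def rotation_z(theta):
    """Matrix of a rotation by theta about the third axis (assumed example of R)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

F = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [4.0, 0.0, 1.0]]   # arbitrary values
R = rotation_z(0.7)
F_prime = matmul(transpose(R), matmul(F, R))              # F' = R^T F R

print(det3(F), det3(F_prime))
print(trace(F), trace(F_prime))
print(F[0][1], F_prime[0][1])   # an individual element does change
```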

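Since these quantities are basis independent, the determinant and trace of an
operator may be defined by selecting any basis $\hat{\mathbf{e}}_i$, determining the matrix $F$ corresponding
to $\mathcal{F}$ in that basis, and setting

$$ \det\mathcal{F} = |F| \quad\text{and}\quad \operatorname{Tr}\mathcal{F} = \operatorname{Tr} F \tag{7.34} $$

It follows from definition eqn (7.34) and the corresponding properties of matrices in
Section B.11 and eqn (B.33) that the determinant and trace of operators have the
properties

$$ \det\mathcal{A}^T = \det\mathcal{A} \qquad \det(\mathcal{A}\mathcal{B}) = \det\mathcal{A}\,\det\mathcal{B} \tag{7.35} $$

$$ \operatorname{Tr}(\mathcal{A}\mathcal{B}\mathcal{C}) = \operatorname{Tr}(\mathcal{C}\mathcal{A}\mathcal{B}) = \operatorname{Tr}(\mathcal{B}\mathcal{C}\mathcal{A}) \tag{7.36} $$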

An operator $\mathcal{F}$ may or may not have an inverse. If the inverse exists, it is denoted
$\mathcal{F}^{-1}$ and has the defining property that, for both right and left multiplication, the
product of $\mathcal{F}$ with its inverse is the identity operator $\mathcal{U}$,

$$ \mathcal{F}^{-1}\mathcal{F} = \mathcal{U} = \mathcal{F}\mathcal{F}^{-1} \tag{7.37} $$

The inverse is unique. If two operators both are inverses of a given $\mathcal{F}$, then they can
be shown to be identical to each other.

The necessary and sufficient condition for the inverse $\mathcal{F}^{-1}$ of an operator $\mathcal{F}$ to exist
is that $\det\mathcal{F} \neq 0$. This result follows from the definition in eqn (7.34) and the similar
property of matrices proved in Section B.14.

If $\mathcal{C} = \mathcal{A}\mathcal{B}$, it is easily verified that the inverse is $\mathcal{C}^{-1} = \mathcal{B}^{-1}\mathcal{A}^{-1}$, provided of
course that the inverses of $\mathcal{A}$ and $\mathcal{B}$ exist. The inverse of a product is the product of
the inverses, in reverse order.

7.5 Special Operators 

If an operator $\mathcal{S}$ is identical to its transpose, $\mathcal{S}^T = \mathcal{S}$, then $S^T_{ij} = S_{ji} = S_{ij}$ and we say
that it (and its matrix) are symmetric.$^{38}$

An anti-symmetric (an alternate term is skew-symmetric) operator is in a sense
the opposite of a symmetric one. Such an operator is equal to the negative of its
transpose. If operator $\mathcal{W}$ is anti-symmetric, then $\mathcal{W}^T = -\mathcal{W}$ and its matrix elements
obey $W^T_{ij} = W_{ji} = -W_{ij}$.

The most general anti-symmetric operator has a matrix containing only three independent
matrix elements,

$$ W = \begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix} \tag{7.38} $$

or equivalently

$$ W_{ij} = \sum_{k=1}^{3} \varepsilon_{ikj}\,\omega_k \tag{7.39} $$

for $i, j = 1, 2, 3$, where $\omega_1, \omega_2, \omega_3$ are three arbitrarily chosen numbers that together
determine $\mathcal{W}$.

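The operation of an anti-symmetric operator $\mathcal{W}$ on a vector can be represented as
a cross-product.

Lemma 7.5.1: Equivalent Cross-Product

If we define a vector $\boldsymbol{\omega}$ whose components are the same three numbers $\omega_i$ found in eqns
(7.38, 7.39),

$$ \boldsymbol{\omega} = \omega_1\hat{\mathbf{e}}_1 + \omega_2\hat{\mathbf{e}}_2 + \omega_3\hat{\mathbf{e}}_3 \tag{7.40} $$

then the action of operator $\mathcal{W}$ on an arbitrary vector $\mathbf{V}$ is the same as the cross product
of vector $\boldsymbol{\omega}$ with that vector,

$$ \mathcal{W}\mathbf{V} = \boldsymbol{\omega} \times \mathbf{V} \tag{7.41} $$

38 Matrix symmetries are treated in Section B.4.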
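Proof: Let $\mathbf{A} = \boldsymbol{\omega} \times \mathbf{V}$. Then

$$ A_i = \hat{\mathbf{e}}_i \cdot \mathbf{A} = \hat{\mathbf{e}}_i \cdot \boldsymbol{\omega} \times \mathbf{V} = \sum_{j=1}^{3}\sum_{k=1}^{3} \omega_k V_j \left( \hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_k \times \hat{\mathbf{e}}_j \right) = \sum_{j=1}^{3}\sum_{k=1}^{3} \varepsilon_{ikj}\,\omega_k V_j = \sum_{j=1}^{3} W_{ij} V_j \tag{7.42} $$

which is the component form of the matrix equation $[A] = W[V]$. Since there is a
one-to-one correspondence between operators and matrices, it follows that $\mathbf{A} = \mathcal{W}\mathbf{V}$,
as was to be proved. □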

If the anti-symmetric operator is given initially, the components of $\boldsymbol{\omega}$ can be extracted
from its matrix by

$$ \omega_k = \frac{1}{2} \sum_{i=1}^{3}\sum_{j=1}^{3} \varepsilon_{ikj}\,W_{ij} \tag{7.43} $$

where identity eqn (A.65) has been used.
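The relations in eqns (7.39), (7.41), and (7.43) can be verified with a few lines of code. The sketch below uses zero-based indices 0, 1, 2 in place of 1, 2, 3, a closed-form expression for the Levi-Civita symbol that is valid only for that index range, and arbitrary integer test values; all function names are hypothetical.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices drawn from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def antisymmetric_matrix(omega):
    """W_ij = sum_k eps_{ikj} omega_k, eqn (7.39)."""
    return [[sum(levi_civita(i, k, j) * omega[k] for k in range(3))
             for j in range(3)] for i in range(3)]

def cross(a, b):
    """Cross product a x b in a right-handed orthonormal basis."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def apply(W, v):
    """Matrix acting on a column vector, [A] = W [V]."""
    return tuple(sum(W[i][j] * v[j] for j in range(3)) for i in range(3))

def extract_omega(W):
    """omega_k = (1/2) sum_ij eps_{ikj} W_ij, eqn (7.43)."""
    return tuple(sum(levi_civita(i, k, j) * W[i][j]
                     for i in range(3) for j in range(3)) / 2
                 for k in range(3))

omega = (1, 2, 3)      # arbitrary test values
V = (4, -1, 2)
W = antisymmetric_matrix(omega)
print(W)
print(apply(W, V), cross(omega, V))
print(extract_omega(W))
```

The printed matrix has the pattern of eqn (7.38), and the matrix product agrees with the cross product, as Lemma 7.5.1 asserts.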

Another important special class of operators is orthogonal operators. An operator
$\mathcal{R}$ is orthogonal if it has an inverse $\mathcal{R}^{-1}$ and its inverse is equal to its transpose,

$$ \mathcal{R}^{-1} = \mathcal{R}^T \tag{7.44} $$

Thus the property of inverses in eqn (7.37) implies that

$$ \mathcal{R}\mathcal{R}^T = \mathcal{U} = \mathcal{R}^T\mathcal{R} \tag{7.45} $$

for orthogonal operators. Orthogonal operators will be used to characterize rotations
in Chapter 8.

7.6 Dyadics 

There is yet another way of writing linear vector functions of vectors in common use: 
Dyadics. Those who have studied the Dirac notation in quantum mechanics should 
find them familiar. Those who have not can learn dyadics here and get a head start 
on mastering the Dirac notation. 

We begin by defining a single-termed dyadic, or dyad, $\mathbb{D}$ as a pair of vectors $\mathbf{a}$ and
$\mathbf{b}$ written side-by-side with no operation between them such as dot or cross product,

$$ \mathbb{D} = \mathbf{a}\mathbf{b} \tag{7.46} $$

This strange-looking object is intended to be an operator on vectors. But, unlike the
operators defined above, it operates either to its right or to its left, and by means of
dot products rather than directly. Thus, dotting $\mathbb{D}$ to its right onto vector $\mathbf{V}$ is defined
to give

$$ \mathbb{D} \cdot \mathbf{V} = \mathbf{a}\,(\mathbf{b} \cdot \mathbf{V}) \tag{7.47} $$

which is a vector parallel to $\mathbf{a}$. The dyad $\mathbb{D}$ can also be dotted to the left on a vector $\mathbf{V}$
to yield

$$ \mathbf{V} \cdot \mathbb{D} = (\mathbf{V} \cdot \mathbf{a})\,\mathbf{b} \tag{7.48} $$

which is a vector parallel to $\mathbf{b}$. We see at once that left and right dotting will generally
give different output vectors, since $\mathbf{a}$ need not be parallel to $\mathbf{b}$. By its definition in
terms of the dot product, the dyadic operation is a linear function of vector $\mathbf{V}$.

We define a law of addition for dyads similar to that for operators above. Suppose
that two dyads are $\mathbb{D}_1 = \mathbf{a}\mathbf{b}$ and $\mathbb{D}_2 = \mathbf{c}\mathbf{d}$. Then multiplication by scalars and addition
as in

$$ \mathbb{D} = \alpha\mathbb{D}_1 + \beta\mathbb{D}_2 \tag{7.49} $$

are defined by the rule that the operation of $\mathbb{D}$ on any arbitrary vector $\mathbf{V}$ is

$$ \mathbb{D} \cdot \mathbf{V} = \alpha\,\mathbb{D}_1 \cdot \mathbf{V} + \beta\,\mathbb{D}_2 \cdot \mathbf{V} \tag{7.50} $$

A similar rule holds for left multiplication. The sum of one or more dyads is called a
dyadic. The rule for the addition of two dyadics is the same as eqn (7.49) for dyads.

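Now suppose that we have a linear operator like the $\mathcal{F}$ discussed in Section 7.1.
Since the matrix elements $F_{ij}$ of this operator are just numbers like $\alpha$ and $\beta$ in eqn
(7.49), we can define a dyadic corresponding to operator $\mathcal{F}$ by using this addition
rule to write the nine-termed sum

$$ \mathbb{F} = \sum_{i=1}^{3}\sum_{j=1}^{3} F_{ij}\,\hat{\mathbf{e}}_i\hat{\mathbf{e}}_j \tag{7.51} $$

This dyadic is often denoted in equivalent ways, by using the freedom to write the
numerical factor $F_{ij}$ either before the pair of vectors, between the pair (as is often
done in quantum mechanics), or after both of them, as in

$$ \mathbb{F} = \sum_{i=1}^{3}\sum_{j=1}^{3} F_{ij}\,\hat{\mathbf{e}}_i\hat{\mathbf{e}}_j = \sum_{i=1}^{3}\sum_{j=1}^{3} \hat{\mathbf{e}}_i\,F_{ij}\,\hat{\mathbf{e}}_j = \sum_{i=1}^{3}\sum_{j=1}^{3} \hat{\mathbf{e}}_i\hat{\mathbf{e}}_j\,F_{ij} \tag{7.52} $$

Conversely, if we are given a dyadic $\mathbb{F}$, the matrix elements $F_{ij}$ in the $\hat{\mathbf{e}}_i$ basis can be
determined by dotting from both sides with unit vectors, since

$$ \hat{\mathbf{e}}_i \cdot \mathbb{F} \cdot \hat{\mathbf{e}}_j = \hat{\mathbf{e}}_i \cdot \left( \sum_{k=1}^{3}\sum_{l=1}^{3} \hat{\mathbf{e}}_k\,F_{kl}\,\hat{\mathbf{e}}_l \right) \cdot \hat{\mathbf{e}}_j = \sum_{k=1}^{3}\sum_{l=1}^{3} \delta_{ik}\,F_{kl}\,\delta_{lj} = F_{ij} \tag{7.53} $$

As an example of a case in which the dyadic is given initially and the matrix and
operator derived from it, consider the dyad $\mathbb{D} = \mathbf{a}\mathbf{b}$ in eqn (7.46). Then

$$ D_{ij} = \hat{\mathbf{e}}_i \cdot \mathbb{D} \cdot \hat{\mathbf{e}}_j = (\hat{\mathbf{e}}_i \cdot \mathbf{a})(\mathbf{b} \cdot \hat{\mathbf{e}}_j) = a_i b_j \tag{7.54} $$

The matrix element of this simple dyad is just the product of the components of
the two vectors. General dyadics, of course, will not have matrices with this simple
product form.

By its construction, the dyadic $\mathbb{F}$ dotted onto any vector $\mathbf{V}$ has the same effect as
the operator $\mathcal{F}$ acting on that same vector.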
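Lemma 7.6.1: Equivalence of Operator and Dyadic

If $\mathbf{V}$ is any vector, then

$$ \mathcal{F}\mathbf{V} = \mathbb{F} \cdot \mathbf{V} \tag{7.55} $$

Proof: Let $\mathbf{W} = \mathcal{F}\mathbf{V}$ define the vector $\mathbf{W}$. Then from eqn (7.20), $W_i = \sum_{j=1}^{3} F_{ij} V_j$.
The dyadic acting on $\mathbf{V}$ gives the same vector $\mathbf{W}$,

$$ \mathbb{F} \cdot \mathbf{V} = \sum_{i=1}^{3}\sum_{j=1}^{3} \hat{\mathbf{e}}_i\,F_{ij}\,(\hat{\mathbf{e}}_j \cdot \mathbf{V}) = \sum_{i=1}^{3} \hat{\mathbf{e}}_i \sum_{j=1}^{3} F_{ij} V_j = \sum_{i=1}^{3} \hat{\mathbf{e}}_i\,W_i = \mathbf{W} \tag{7.56} $$

which establishes eqn (7.55). □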

Like operators and matrices, dyadics can also be multiplied. The product

$$ \mathbb{C} = \mathbb{A} \cdot \mathbb{B} \tag{7.57} $$

is defined by considering its operation on an arbitrary vector $\mathbf{V}$. The dyadic $\mathbb{B}$ is first
dotted with vector $\mathbf{V}$ and the dyadic $\mathbb{A}$ is then dotted onto the resulting vector,

$$ \mathbb{C} \cdot \mathbf{V} = (\mathbb{A} \cdot \mathbb{B}) \cdot \mathbf{V} = \mathbb{A} \cdot (\mathbb{B} \cdot \mathbf{V}) \tag{7.58} $$

Thus, from Lemma 7.6.1,

$$ \mathcal{A}\mathcal{B}\mathbf{V} = \mathbb{A} \cdot (\mathbb{B} \cdot \mathbf{V}) \tag{7.59} $$

for any vector $\mathbf{V}$.

Like operator multiplication in eqn (7.7), dyadic multiplication is associative by
definition, since

$$ (\mathbb{A} \cdot \mathbb{B}) \cdot \mathbb{C} = \mathbb{A} \cdot (\mathbb{B} \cdot \mathbb{C}) = \mathbb{A} \cdot \mathbb{B} \cdot \mathbb{C} \tag{7.60} $$

Just as for operators in Section 7.4, the determinant and trace of a dyadic are
defined to be the determinant and trace of its associated matrix in some basis. The
inverse dyadic is the dyadic constructed from the inverse matrix, and exists if and
only if the dyadic has a nonzero determinant.

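The transpose of a dyadic is constructed from the transposed matrix, using eqn
(7.52). If a dyadic $\mathbb{F}$ has a matrix $F$ then the transpose is defined as

$$ \mathbb{F}^T = \sum_{i=1}^{3}\sum_{j=1}^{3} F^T_{ij}\,\hat{\mathbf{e}}_i\hat{\mathbf{e}}_j = \sum_{i=1}^{3}\sum_{j=1}^{3} F_{ji}\,\hat{\mathbf{e}}_i\hat{\mathbf{e}}_j \tag{7.61} $$

It follows that left multiplication of $\mathbb{F}$ by $\mathbf{V}$ gives the same result as right multiplication
of $\mathbb{F}^T$ by the same vector. That is,

$$ \mathbf{V} \cdot \mathbb{F} = \mathbb{F}^T \cdot \mathbf{V} \tag{7.62} $$

for any vector $\mathbf{V}$.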
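The dyad rules of this section can be sketched in code by storing a dyad as its pair of vectors. The vectors below are arbitrary illustrative values and the function names are hypothetical. The example shows right and left dotting as in eqns (7.47, 7.48), the product-form matrix of eqn (7.54), and the fact that left dotting corresponds to the transposed matrix, as in eqn (7.62).

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def dyad_dot_right(a, b, v):
    """(ab) . V = a (b . V), eqn (7.47)."""
    s = dot(b, v)
    return tuple(s * ai for ai in a)

def dyad_dot_left(v, a, b):
    """V . (ab) = (V . a) b, eqn (7.48)."""
    s = dot(v, a)
    return tuple(s * bi for bi in b)

def dyad_matrix(a, b):
    """D_ij = a_i b_j, eqn (7.54)."""
    return [[ai * bj for bj in b] for ai in a]

def apply(m, v):
    """Matrix acting on a column vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

a, b, V = (1, 0, 2), (0, 3, 1), (2, 1, 1)   # arbitrary test vectors
right = dyad_dot_right(a, b, V)             # parallel to a
left = dyad_dot_left(V, a, b)               # parallel to b
D = dyad_matrix(a, b)
print(right, left)
print(apply(D, V), apply(transpose(D), V))
```

Because `a` and `b` are not parallel here, the right and left results point in different directions, which is the asymmetry noted after eqn (7.48).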
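
7.7 Resolution of Unity

Consider the identity operator $\mathcal{U}$ which has $\mathcal{U}\mathbf{V} = \mathbf{V}$ for any vector $\mathbf{V}$. The dyadic
form $\mathbb{U}$ of this operator is of particular interest. From eqn (7.22),

$$ \mathbb{U} = \sum_{i=1}^{3}\sum_{j=1}^{3} \hat{\mathbf{e}}_i\,\delta_{ij}\,\hat{\mathbf{e}}_j = \sum_{i=1}^{3} \hat{\mathbf{e}}_i\hat{\mathbf{e}}_i = \hat{\mathbf{e}}_1\hat{\mathbf{e}}_1 + \hat{\mathbf{e}}_2\hat{\mathbf{e}}_2 + \hat{\mathbf{e}}_3\hat{\mathbf{e}}_3 \tag{7.63} $$

This dyadic is called a resolution of unity in basis $\hat{\mathbf{e}}_i$. Since $\mathcal{U}\mathbf{V} = \mathbf{V}$, it follows that
$\mathbb{U} \cdot \mathbf{V} = \mathbf{V}$. The resolution of unity can be used as a convenient device to expand
vectors and other operators in a basis. For example,

$$ \mathbf{V} = \mathbb{U} \cdot \mathbf{V} = (\hat{\mathbf{e}}_1\hat{\mathbf{e}}_1 + \hat{\mathbf{e}}_2\hat{\mathbf{e}}_2 + \hat{\mathbf{e}}_3\hat{\mathbf{e}}_3) \cdot \mathbf{V} = \hat{\mathbf{e}}_1(\hat{\mathbf{e}}_1 \cdot \mathbf{V}) + \hat{\mathbf{e}}_2(\hat{\mathbf{e}}_2 \cdot \mathbf{V}) + \hat{\mathbf{e}}_3(\hat{\mathbf{e}}_3 \cdot \mathbf{V}) \tag{7.64} $$

simply restates eqn (A.10). Any vector $\mathbf{V}$ in any expression can always be replaced
by either $\mathbb{U} \cdot \mathbf{V}$ or $\mathbf{V} \cdot \mathbb{U}$. The result is always to expand the expression in terms of
components in the resolution’s basis, in this case $\hat{\mathbf{e}}_i$.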
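The resolution of unity works in any orthonormal basis, not only the one in which components are first written down. As a hedged illustration, the sketch below assumes a basis rotated by 0.3 radians about the third axis (an arbitrary choice), builds the matrix of the sum of dyads $\hat{\mathbf{e}}_i\hat{\mathbf{e}}_i$ using the dyad rule $D_{jk} = a_j b_k$, and rebuilds a vector from its components in that basis. The variable names are hypothetical.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

c, s = math.cos(0.3), math.sin(0.3)
# An assumed orthonormal basis, rotated about the third axis.
basis = [(c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)]

# Matrix of the dyadic sum_i e_i e_i; each dyad contributes e[j] * e[k].
U = [[sum(e[j] * e[k] for e in basis) for k in range(3)] for j in range(3)]

V = (1.0, -2.0, 0.5)                         # arbitrary test vector
components = [dot(e, V) for e in basis]      # e_i . V in the rotated basis
# V = sum_i e_i (e_i . V), the pattern of eqn (7.64).
rebuilt = tuple(sum(comp * e[j] for comp, e in zip(components, basis))
                for j in range(3))
print(U)
print(components, rebuilt)
```

Up to rounding, `U` is the unit matrix and `rebuilt` reproduces `V`, even though the individual components in the rotated basis differ from the original ones.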

7.8 Operators, Components, Matrices, and Dyadics 

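The equation $\mathbf{W} = \mathcal{F}\mathbf{V}$ can now be written in four equivalent ways: operator, component,
matrix, and dyadic:

$$ \mathbf{W} = \mathcal{F}\mathbf{V} \qquad W_i = \sum_{j=1}^{3} F_{ij} V_j \qquad [W] = F\,[V] \qquad \mathbf{W} = \mathbb{F} \cdot \mathbf{V} \tag{7.65} $$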

Each of the four expressions in eqn (7.65) is a different way of saying the same thing, 
and each of them implies the others. This, and the various other equivalences proved 
in the preceding sections of this chapter, can be summarized as a theorem, which we 
state here without further proof. 

Theorem 7.8.1: Equivalence of Operators, Matrices, Dyadics 

Any equation involving the addition, multiplication, transposition, and inversion of oper- 
ators, and the action of operators on vectors, will be true if and only if the same equation 
is true with matrices or dyadics substituted for the operators. In the matrix case, of 
course, the vectors must also be replaced by column vectors. 

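As an example of this theorem, consider the following equivalent expressions,

$$ \mathcal{A}\mathcal{B}^T\mathbf{V} + \alpha\,\mathcal{C}^{-1}\mathcal{D}\mathbf{W} = \mathbf{Y} \tag{7.66} $$

$$ \sum_{j=1}^{3}\sum_{k=1}^{3} A_{ij} B^T_{jk} V_k + \alpha \sum_{j=1}^{3}\sum_{k=1}^{3} C^{-1}_{ij} D_{jk} W_k = Y_i \tag{7.67} $$

$$ A\,B^T [V] + \alpha\,C^{-1} D\,[W] = [Y] \tag{7.68} $$

$$ \mathbb{A} \cdot \mathbb{B}^T \cdot \mathbf{V} + \alpha\,\mathbb{C}^{-1} \cdot \mathbb{D} \cdot \mathbf{W} = \mathbf{Y} \tag{7.69} $$

Each of them is true if and only if the other three are true.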
The theorem of this section is of great use. It means that operator equations can
be proved by proving the equivalent component or matrix equations in some basis, 
and vice versa. Throughout the text, we will use this theorem to go back and forth 
between operator and matrix relations, often with little warning, assuming that the 
reader understands that they are equivalent. 

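For example, if $\mathcal{F}(\eta) = \mathcal{A}(\eta)\,\mathcal{B}(\eta)$ is a product of two operators, each of which is a
function of some parameter $\eta$, then the product rule for differentiation,

$$ \frac{d\mathcal{F}(\eta)}{d\eta} = \frac{d\mathcal{A}(\eta)}{d\eta}\,\mathcal{B}(\eta) + \mathcal{A}(\eta)\,\frac{d\mathcal{B}(\eta)}{d\eta} \tag{7.70} $$

follows from the usual product rule for differentiation of the component expansion

$$ F_{ik}(\eta) = \sum_{j=1}^{3} A_{ij}(\eta)\,B_{jk}(\eta) \tag{7.71} $$

where the matrix elements $A_{ij}(\eta)$, etc., are now just ordinary functions of $\eta$.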

For the remainder of this chapter, we will exploit Theorem 7.8.1 to translate the 
properties of matrices summarized in Appendix B into operator and dyadic forms. 
Since the proofs are given in Appendix B, we will often simply state the results here 
and refer the reader to that Appendix for more information. 



7.9 Complex Vectors and Operators 

A real vector is one whose components in some Cartesian basis are all purely real
numbers. (The Cartesian basis vectors themselves are always considered real in these
determinations.) If at least one component is an imaginary or complex number, the
vector is complex. A general complex vector $\mathbf{V}$ may be written

$$ \mathbf{V} = \mathbf{V}_R + i\,\mathbf{V}_I \tag{7.72} $$

where real vector $\mathbf{V}_R$ collects all of the real parts of the components of $\mathbf{V}$ and real
vector $\mathbf{V}_I$ collects all of the imaginary parts. For example, if $\mathbf{V} = 3\hat{\mathbf{e}}_1 + (2 - 4i)\,\hat{\mathbf{e}}_2 + 6i\,\hat{\mathbf{e}}_3$
then $\mathbf{V}_R = 3\hat{\mathbf{e}}_1 + 2\hat{\mathbf{e}}_2$ and $\mathbf{V}_I = -4\hat{\mathbf{e}}_2 + 6\hat{\mathbf{e}}_3$.

Operators and dyadics can also be real or complex. The definition is similar to that
for vectors. An operator $\mathcal{F}$, or dyadic $\mathbb{F}$, is real only if all of its matrix elements $F_{ij}$
in some Cartesian basis are real numbers. If even one matrix element is imaginary
or complex, the operator is complex. The transpose, and the definitions of symmetric,
anti-symmetric, and orthogonal operators, must be generalized when complex
operators are considered.

The complex conjugate of an operator can be defined as that operator all of whose
matrix elements in some Cartesian basis $\hat{\mathbf{e}}_i$ are the complex conjugates of the original
ones. If $\mathcal{F}$ has matrix elements $F_{ij}$, then

$$ \mathcal{F}^* \text{ has matrix elements } F^*_{ij} \tag{7.73} $$

Thus an operator is real if and only if $\mathcal{F} = \mathcal{F}^*$. Otherwise, it is complex.
The generalization of transpose is Hermitian conjugate. The Hermitian conjugate
of operator $\mathcal{F}$ is denoted $\mathcal{F}^\dagger$ and is defined in a way similar to eqn (7.13) for the
transpose, by the condition that

$$ (\mathcal{F}\mathbf{V})^* \cdot \mathbf{W} = \mathbf{V}^* \cdot (\mathcal{F}^\dagger\mathbf{W}) \tag{7.74} $$

for any vectors $\mathbf{V}$, $\mathbf{W}$. As was done for the transpose in eqn (7.23), basis vectors may
be substituted for the vectors in eqn (7.74), giving the relation

$$ F^\dagger_{ij} = F^*_{ji} \tag{7.75} $$

for the matrix elements $F^\dagger_{ij}$ of matrix $F^\dagger$.

Notice that the Hermitian conjugate can be considered as the combination of
transpose and complex conjugate in either order,

$$ \mathcal{F}^\dagger = (\mathcal{F}^T)^* = (\mathcal{F}^*)^T \tag{7.76} $$

as can be seen by considering the matrix elements of each expression.

If the operator is real, then the complex conjugations have no effect, and $\mathcal{F}^\dagger = \mathcal{F}^T$
holds. Thus the definition of Hermitian conjugate for possibly complex operators is a
generalization of the definition of transpose for real ones.

The generalization of symmetric is Hermitian. If an operator $\mathcal{H}$ is equal to its Hermitian
conjugate, then it is Hermitian. Then $\mathcal{H}^\dagger = \mathcal{H}$ and hence $H^\dagger_{ij} = H^*_{ji} = H_{ij}$. For
real operators, the complex conjugation would have no effect, and hence a real Hermitian
operator is a real symmetric one. Thus the definition of Hermitian for possibly
complex operators is a generalization of the definition of symmetric for real ones. Similarly,
anti-Hermitian operators can be defined that generalize anti-symmetric ones.

The generalization of orthogonal is unitary. An operator $\mathcal{T}$ is unitary if it is nonsingular
and if its inverse is equal to its Hermitian conjugate,

$$ \mathcal{T}^{-1} = \mathcal{T}^\dagger \tag{7.77} $$

with the consequence that

$$ \mathcal{T}\mathcal{T}^\dagger = \mathcal{U} = \mathcal{T}^\dagger\mathcal{T} \tag{7.78} $$

As seen above, for real operators there is no distinction between transpose and Hermitian
conjugate. Hence a real unitary operator would be a real orthogonal one. Thus
the definition of unitary for possibly complex operators is a generalization of the definition
of orthogonal for real ones.

For complex operators, the determinants obey

$$ \det\mathcal{F}^* = (\det\mathcal{F})^* \quad\text{and hence also}\quad \det\mathcal{F}^\dagger = (\det\mathcal{F})^* \tag{7.79} $$

Just as a complex vector can be written as the sum of its real and imaginary parts 
as in eqn (7.72), any complex operator can be written as the sum of two Hermitian 
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operators, 

T = Tr + iT, (7.80) 

where the Hermitian operators Tr and T] are 

Tr=\(t + r) and Ti = --(?- (7.81) 

7.10 Real and Complex Inner Products 

Recall that, in the space of real vectors, the inner product of two vectors can be written
in column vector and component forms, as

$$ \mathbf{V}\cdot\mathbf{W} = \sum_{i=1}^{3} V_i W_i = \begin{pmatrix} V_1 & V_2 & V_3 \end{pmatrix} \begin{pmatrix} W_1 \\ W_2 \\ W_3 \end{pmatrix} = [V]^T[W] \quad\text{(real vectors)} \tag{7.82} $$

In a space of complex vectors, this definition of inner product must be modified.
Orthogonality, norm, etc., for such a complex vector space are based on a generalized
inner product consisting of the dot product and complex conjugation of the left-hand
vector,

$$ \mathbf{V}^*\cdot\mathbf{W} = \sum_{i=1}^{3} V_i^* W_i = \begin{pmatrix} V_1^* & V_2^* & V_3^* \end{pmatrix} \begin{pmatrix} W_1 \\ W_2 \\ W_3 \end{pmatrix} = [V]^\dagger[W] \quad\text{(complex vectors)} \tag{7.83} $$

Note that the transpose of the column vector $[V]^T$ in the real case becomes the
Hermitian conjugate $[V]^\dagger$ in the complex case.

The redefinition of inner product for complex vectors is necessary in order to 
preserve an important property that dot products have in real vector spaces: The 
norm of a vector must be non-negative and be zero only for the null vector. Thus we 
have, with the redefinition, 

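$$ V^2 = \|\mathbf{V}\|^2 = \mathbf{V}^*\cdot\mathbf{V} = (\mathbf{V}_R - i\mathbf{V}_I)\cdot(\mathbf{V}_R + i\mathbf{V}_I) = \|\mathbf{V}_R\|^2 + \|\mathbf{V}_I\|^2 \tag{7.84} $$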

which clearly has the desired non-negative property. The rule is that, when using 
complex vector spaces, one must always be sure that the left-hand vector in an inner 
product is complex conjugated before the dot product is taken. 
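A small example shows why the conjugation is needed. The sketch below (plain Python; the vector and the function names `dot_plain` and `inner` are illustrative assumptions) uses the non-null vector with components (1, i, 0): without conjugation its "squared norm" vanishes, while the inner product of eqn (7.83) gives the expected positive value.

```python
# Sketch: the unconjugated dot product can vanish for a non-null complex vector.

def dot_plain(v, w):
    """Real-style dot product with no complex conjugation."""
    return sum(a * b for a, b in zip(v, w))

def inner(v, w):
    """Complex inner product V* . W of eqn (7.83)."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

V = [1 + 0j, 0 + 1j, 0j]

naive = dot_plain(V, V)   # 1*1 + i*i + 0 = 0, although V is not the null vector
norm2 = inner(V, V)       # |1|^2 + |i|^2 + 0 = 2

print(naive, norm2)
```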

7.11 Eigenvectors and Eigenvalues 

An operator $\mathcal{F}$ acts on some vectors (but not others) in a particularly simple way: it
gives an output vector which is just the input vector multiplied by a numerical scale
factor. Those vectors are called the eigenvectors of $\mathcal{F}$ (“eigen” is German for “own”)
and the scale factors (in general different for each eigenvector) are called eigenvalues.
The equation

$$ \mathcal{F}\mathbf{V}^{(k)} = \lambda_k \mathbf{V}^{(k)} \tag{7.85} $$

defines $\mathbf{V}^{(k)}$ to be an eigenvector, and $\lambda_k$ to be the associated eigenvalue, of operator
$\mathcal{F}$. The integer $k$ labels the different eigenvalues and corresponding eigenvectors that
$\mathcal{F}$ may have. The set of eigenvectors and eigenvalues of an operator may in many
cases characterize it completely, as we will see, and so their determination is of
particular importance. 39

We may rewrite this equation, and its equivalent matrix equation in some basis,
in the forms

$$ (\mathcal{F} - \lambda_k\mathcal{U})\,\mathbf{V}^{(k)} = 0 \quad\text{and}\quad (F - \lambda_k U)\,[V^{(k)}] = 0 \tag{7.86} $$

where $[V^{(k)}]$ is the column vector of components $V_i^{(k)} = \hat{\mathbf{e}}_i\cdot\mathbf{V}^{(k)}$ of eigenvector $\mathbf{V}^{(k)}$ in
the chosen basis. The matrix equation, and hence the operator equation also, has a
solution other than the null vector if and only if

$$ \det(\mathcal{F} - \lambda_k\mathcal{U}) = 0 \quad\text{with matrix equivalent}\quad |F - \lambda_k U| = 0 \tag{7.87} $$

This cubic equation has three eigenvalue solutions $\lambda_1, \lambda_2, \lambda_3$ which may in general be
complex numbers. For each of those solutions, $(F - \lambda_k U)$ has rank less than three,
and so a non-null eigenvector solution to eqn (7.86) can be found. 40 These eigenvectors
are usually normalized by dividing each one by its magnitude to produce a unit
vector. The form of eqn (7.85) shows that these normalized vectors $\hat{\mathbf{V}}^{(k)} = \mathbf{V}^{(k)}/\|\mathbf{V}^{(k)}\|$
are still eigenvectors.

7.12 Eigenvectors of Real Symmetric Operator 

Real symmetric operators $\mathcal{S}$, obeying $\mathcal{S}^T = \mathcal{S}$, are an important special case. We list
here some properties of their eigenvalues and eigenvectors. The listed properties are
proved in Section B.24.

1. The eigenvalues $\lambda_k$ of real symmetric operators are all real.

2. Since all matrix elements $S_{ij}$ are real numbers, the eigenvector solutions of eqn
(7.86) may be taken to be real vectors.

3. If two eigenvalues are different, $\lambda_k \neq \lambda_n$, then the corresponding eigenvectors are
orthogonal, $\hat{\mathbf{V}}^{(k)}\cdot\hat{\mathbf{V}}^{(n)} = 0$.

4. Three orthogonal unit eigenvectors of $\mathcal{S}$ can always be found. These three
eigenvectors obey $\hat{\mathbf{V}}^{(k)}\cdot\hat{\mathbf{V}}^{(n)} = \delta_{kn}$ and are said to form a complete orthonormal set. The
word “complete” is used here to indicate that these three eigenvectors could be
used as an orthonormal basis in place of the $\hat{\mathbf{e}}_i$ if desired.

7.13 Eigenvectors of Real Anti-Symmetric Operator

The eigenvalue problem for the real, anti-symmetric operators described in Section
7.5 is of particular importance in the study of rigid body rotations. Fortunately, the
eigenvalues and eigenvectors of the most general anti-symmetric operator in three
dimensions can be found in a standard form.

39 The reader should refer to Section B.23 for more detail about finding eigenvalues and eigenvectors. 
40 See Section B.19. 
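Theorem 7.13.1: Eigenvectors of Anti-Symmetric Operators

If $\mathcal{W}$ is a real anti-symmetric operator obeying $\mathcal{W}^T = -\mathcal{W}$ with eigenvector equation

$$ \mathcal{W}\hat{\mathbf{V}}^{(k)} = \lambda_k\hat{\mathbf{V}}^{(k)} \tag{7.88} $$

then its eigenvalues and corresponding eigenvectors are

$$ \lambda_1 = i\omega \qquad \lambda_2 = -i\omega \qquad \lambda_3 = 0 \tag{7.89} $$

and

$$ \hat{\mathbf{V}}^{(1)} = (\hat{\mathbf{a}} - i\hat{\mathbf{b}})/\sqrt{2}, \qquad \hat{\mathbf{V}}^{(2)} = (\hat{\mathbf{a}} + i\hat{\mathbf{b}})/\sqrt{2}, \qquad \hat{\mathbf{V}}^{(3)} = \boldsymbol{\omega}/\omega \tag{7.90} $$

where $\boldsymbol{\omega}$ is the vector defined in eqn (7.40), $\omega = \|\boldsymbol{\omega}\|$ is its magnitude, $\hat{\mathbf{a}}$ is some real unit
vector perpendicular to $\boldsymbol{\omega}$, and $\hat{\mathbf{b}} = (\boldsymbol{\omega}/\omega)\times\hat{\mathbf{a}}$ is also a real unit vector, perpendicular to
both $\hat{\mathbf{a}}$ and $\boldsymbol{\omega}$.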

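Proof: Direct computation of eqn (7.87) using the matrix in eqn (7.38) shows the
eigenvalues to be as stated. Lemma 7.5.1 showed that $\mathcal{W}\mathbf{V} = \boldsymbol{\omega}\times\mathbf{V}$ for any vector $\mathbf{V}$.
Applying this result, one easily proves that $\mathcal{W}\hat{\mathbf{V}}^{(k)} = \boldsymbol{\omega}\times\hat{\mathbf{V}}^{(k)} = \lambda_k\hat{\mathbf{V}}^{(k)}$ for $k = 1, 2, 3$,
as was to be proved. □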
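The statement of Theorem 7.13.1 can be checked numerically for a particular case. The sketch below is plain Python and rests on stated assumptions: the vector (1, 2, 2) with magnitude 3, the particular perpendicular unit vector chosen for `a`, and all variable and function names are illustrative choices, not taken from the text. The action of the operator is implemented as the cross product of Lemma 7.5.1.

```python
import math

def cross(u, v):
    """Cross product u x v; works for real or complex components."""
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

def inner(v, w):
    """Complex inner product V* . W."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

w = [1.0, 2.0, 2.0]                               # illustrative omega vector
omega = math.sqrt(sum(c * c for c in w))          # its magnitude, 3
n_hat = [c / omega for c in w]
a = [2 / math.sqrt(5), -1 / math.sqrt(5), 0.0]    # real unit vector with a . omega = 0
b = cross(n_hat, a)                               # b = (omega/|omega|) x a

vecs = [[(a[i] - 1j * b[i]) / math.sqrt(2) for i in range(3)],
        [(a[i] + 1j * b[i]) / math.sqrt(2) for i in range(3)],
        [complex(c) for c in n_hat]]
lams = [1j * omega, -1j * omega, 0j]

# Largest component of (omega x V) - lambda V for each eigenpair
residuals = [max(abs(wv - lam * x) for wv, x in zip(cross(w, v), v))
             for lam, v in zip(lams, vecs)]

# Matrix of inner products; it should be the unit matrix
gram = [[inner(vecs[k], vecs[l]) for l in range(3)] for k in range(3)]

print(max(residuals) < 1e-12)
```

The same three vectors are also orthonormal under the complex inner product of Section 7.10, which is what the `gram` matrix tests.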

Note that the three eigenvalues of an anti-symmetric $\mathcal{W}$ are always distinct. They
would be equal (all zero) only in the case $\omega = 0$ which would imply a null operator.
The three eigenvectors of $\mathcal{W}$ are orthonormal using the extended definition of inner
product appropriate for vectors with complex components discussed in Section 7.10
above. They are easily shown to obey $\hat{\mathbf{V}}^{(k)*}\cdot\hat{\mathbf{V}}^{(l)} = \delta_{kl}$.




Fig. 7.1. Construction of eigenvectors of W 

One might think that the eigenvalue problem for W is not really solved, due to 
the arbitrary choice of vector a. But in fact we have solved the problem as well as 
eigenvalue problems can ever be solved. To show this, we begin with a lemma. 

Lemma 7.13.2: Underdetermination of Eigenvectors

Normalized eigenvectors are determined only up to an arbitrary phase factor $\exp(i\alpha_k)$,
where the $\alpha_k$ are real numbers that may in general be different for the different eigenvectors.
If an eigenvector problem is solved to give an orthonormal set of eigenvectors $\hat{\mathbf{V}}^{(k)}$, then

$$ \hat{\mathbf{V}}^{(k)\prime} = \exp(i\alpha_k)\,\hat{\mathbf{V}}^{(k)} \tag{7.91} $$

are also an orthonormal solution to the same problem.

Proof: Equation (7.85) is homogeneous in the eigenvectors. Thus, when any
normalized eigenvector $\hat{\mathbf{V}}^{(k)}$ is multiplied by a factor of the form $\exp(i\alpha_k)$, the $\exp(i\alpha_k)$
factors on left and right of eqn (7.85) will cancel and the result will still be an
eigenvector. The resulting set of eigenvectors will also still be normalized and mutually
orthogonal, since

$$ \hat{\mathbf{V}}^{(k)\prime *}\cdot\hat{\mathbf{V}}^{(l)\prime} = \left(\exp(i\alpha_k)\hat{\mathbf{V}}^{(k)}\right)^*\cdot\left(\exp(i\alpha_l)\hat{\mathbf{V}}^{(l)}\right) = \exp(i\alpha_l - i\alpha_k)\,\hat{\mathbf{V}}^{(k)*}\cdot\hat{\mathbf{V}}^{(l)} = \exp(i\alpha_l - i\alpha_k)\,\delta_{kl} = \delta_{kl} \tag{7.92} $$

using of course the extended definition of dot product. □

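It follows that the eigenvector equation eqn (7.88) can never determine $\hat{\mathbf{a}}$ completely.
It can be any unit vector lying in the plane perpendicular to $\boldsymbol{\omega}$. To see this, use the
real, orthogonal unit vectors $\hat{\mathbf{a}}$ and $\hat{\mathbf{b}}$ defined above to derive the identities

$$ \exp(i\alpha)\left(\hat{\mathbf{a}} - i\hat{\mathbf{b}}\right) = \left(\hat{\mathbf{a}}' - i\hat{\mathbf{b}}'\right) \quad\text{and}\quad \exp(-i\alpha)\left(\hat{\mathbf{a}} + i\hat{\mathbf{b}}\right) = \left(\hat{\mathbf{a}}' + i\hat{\mathbf{b}}'\right) \tag{7.93} $$

where

$$ \hat{\mathbf{a}}' = \cos\alpha\,\hat{\mathbf{a}} + \sin\alpha\,\hat{\mathbf{b}} \quad\text{and}\quad \hat{\mathbf{b}}' = -\sin\alpha\,\hat{\mathbf{a}} + \cos\alpha\,\hat{\mathbf{b}} \tag{7.94} $$

are $\hat{\mathbf{a}}$ and $\hat{\mathbf{b}}$ rotated by the same angle $\alpha$ in the plane they define. Note that $\hat{\mathbf{b}}' =
(\boldsymbol{\omega}/\omega)\times\hat{\mathbf{a}}'$ and $\hat{\mathbf{a}}'\cdot\hat{\mathbf{b}}' = 0$ remain true. Thus the rotation of $\hat{\mathbf{a}}$ by angle $\alpha$ to produce
some other vector $\hat{\mathbf{a}}'$ lying in the same plane leads to the new set of eigenvectors

$$ \hat{\mathbf{V}}^{(1)\prime} = \exp(i\alpha)\,\hat{\mathbf{V}}^{(1)} \qquad \hat{\mathbf{V}}^{(2)\prime} = \exp(-i\alpha)\,\hat{\mathbf{V}}^{(2)} \qquad \hat{\mathbf{V}}^{(3)\prime} = \exp(0)\,\hat{\mathbf{V}}^{(3)} \tag{7.95} $$

By Lemma 7.13.2, the eigenvector equation cannot distinguish between the two sets
and so cannot determine the vector $\hat{\mathbf{a}}$.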
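The first identity in eqn (7.93) follows in one line from Euler's formula. The following worked step is a sketch that uses only the rotated vectors defined in eqn (7.94); the second identity is its complex conjugate.

```latex
e^{i\alpha}\left(\hat{\mathbf{a}} - i\hat{\mathbf{b}}\right)
  = (\cos\alpha + i\sin\alpha)\left(\hat{\mathbf{a}} - i\hat{\mathbf{b}}\right)
  = \left(\cos\alpha\,\hat{\mathbf{a}} + \sin\alpha\,\hat{\mathbf{b}}\right)
    - i\left(-\sin\alpha\,\hat{\mathbf{a}} + \cos\alpha\,\hat{\mathbf{b}}\right)
  = \hat{\mathbf{a}}' - i\,\hat{\mathbf{b}}'
```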

7.14 Normal Operators 

The properties of normal operators given below are proved for normal matrices in
the last three sections of Appendix B. Their correctness as operator relations is a
consequence of the one-to-one correspondence between operators and matrices proved
in Section 7.8. Note that the eigenvectors in Section B.26 and following sections are
denoted by a different symbol, whereas we are using $[\hat{V}^{(k)}]$ here.
An operator will have a complete orthonormal set of eigenvectors obeying

$$ \hat{\mathbf{V}}^{(k)*}\cdot\hat{\mathbf{V}}^{(l)} = \delta_{kl} \tag{7.96} $$

if and only if it is a normal operator. An operator $\mathcal{A}$ is called a normal operator if it
commutes with its Hermitian conjugate,

$$ \mathcal{A}^\dagger\mathcal{A} = \mathcal{A}\mathcal{A}^\dagger \quad\text{or, equivalently,}\quad [\mathcal{A}, \mathcal{A}^\dagger]_c = 0 \tag{7.97} $$

Most operators that might be used in mechanics are normal. Real symmetric, real
anti-symmetric, real orthogonal, Hermitian, anti-Hermitian, and unitary operators are all
normal operators.

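To exploit the properties of normal operators, we define a linear operator $\mathcal{D}$ to
be the operator that converts each of the basis vectors $\hat{\mathbf{e}}_k$ into the corresponding unit
eigenvector of the normal operator

$$ \mathcal{D}\hat{\mathbf{e}}_k = \hat{\mathbf{V}}^{(k)} \tag{7.98} $$

for $k = 1, 2, 3$. Since the eigenvectors $\hat{\mathbf{V}}^{(k)}$ will in general not be real vectors, it follows
that $\mathcal{D}$ will not in general be a real operator. It follows from definition eqn (7.98) that
the matrix elements of $\mathcal{D}$ in the $\hat{\mathbf{e}}_i$ basis are

$$ D_{ik} = \hat{\mathbf{e}}_i\cdot\mathcal{D}\hat{\mathbf{e}}_k = \hat{\mathbf{e}}_i\cdot\hat{\mathbf{V}}^{(k)} = \hat{V}_i^{(k)} \tag{7.99} $$

so that matrix element $D_{ik}$ is equal to the $i$th component of the $k$th eigenvector, and
the matrix $D$ can be constructed by writing the components of the three normalized
eigenvectors as its three columns, as in

$$ D = \begin{pmatrix} \hat{V}_1^{(1)} & \hat{V}_1^{(2)} & \hat{V}_1^{(3)} \\ \hat{V}_2^{(1)} & \hat{V}_2^{(2)} & \hat{V}_2^{(3)} \\ \hat{V}_3^{(1)} & \hat{V}_3^{(2)} & \hat{V}_3^{(3)} \end{pmatrix} \tag{7.100} $$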



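The orthogonality condition eqn (7.96) may now be written out as

$$ \delta_{kl} = \hat{\mathbf{V}}^{(k)*}\cdot\hat{\mathbf{V}}^{(l)} = \sum_{i=1}^{3}\hat{V}_i^{(k)*}\hat{V}_i^{(l)} = \sum_{i=1}^{3} D^*_{ik}D_{il} = \sum_{i=1}^{3} D^\dagger_{ki}D_{il} \tag{7.101} $$

where the definition of Hermitian conjugate in eqn (7.75) was used. Thus

$$ U_{kl} = \delta_{kl} = \left(D^\dagger D\right)_{kl} \tag{7.102} $$

for $k, l = 1, 2, 3$, and hence

$$ \mathcal{U} = \mathcal{D}^\dagger\mathcal{D} \tag{7.103} $$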



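As proved in Theorem B.22.2 this is sufficient to prove that $\mathcal{D}$ is a unitary operator
obeying eqn (7.78)

$$ \mathcal{D}^\dagger\mathcal{D} = \mathcal{U} = \mathcal{D}\mathcal{D}^\dagger \tag{7.104} $$

Let us now define the operator $\mathcal{E}$ by the two equivalent formulae

$$ \mathcal{E} = \mathcal{D}^\dagger\mathcal{A}\mathcal{D} \quad\text{and}\quad \mathcal{A} = \mathcal{D}\mathcal{E}\mathcal{D}^\dagger \tag{7.105} $$

By eqns (7.30, 7.99), the matrix elements of $\mathcal{E}$ in the $\hat{\mathbf{e}}_i$ basis are

$$ E_{kl} = \left(D^\dagger A D\right)_{kl} = \sum_{i=1}^{3}\sum_{j=1}^{3} D^\dagger_{ki}A_{ij}D_{jl} = \sum_{i=1}^{3}\sum_{j=1}^{3}\hat{V}_i^{(k)*}A_{ij}\hat{V}_j^{(l)} = \sum_{i=1}^{3}\hat{V}_i^{(k)*}\lambda_l\hat{V}_i^{(l)} = \lambda_l\,\delta_{kl} \tag{7.106} $$

where the component expansions of eqns (7.85, 7.96) have been used to write

$$ \sum_{j=1}^{3} A_{ij}\hat{V}_j^{(l)} = \lambda_l\hat{V}_i^{(l)} \quad\text{and}\quad \sum_{i=1}^{3}\hat{V}_i^{(k)*}\hat{V}_i^{(l)} = \delta_{kl} \tag{7.107} $$

Thus the matrix of operator $\mathcal{E}$ is

$$ E = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \tag{7.108} $$

a diagonal matrix with the eigenvalues as its diagonal elements. We say that the
operator $\mathcal{D}$ reduces $\mathcal{A}$ to a diagonal operator $\mathcal{E}$.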
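The reduction to diagonal form can be tried out on the real anti-symmetric operator of Section 7.13. The sketch below is plain Python under stated assumptions: the vector (1, 2, 2), the choice of the perpendicular unit vector `a`, the use of the standard cross-product matrix for the operator, and all helper names are illustrative and not taken from the text. It builds the matrix whose columns are the eigenvectors, as in eqn (7.100), and forms the two products in eqns (7.103) and (7.105).

```python
import math

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

def matmul(x, y):
    """Product of two 3x3 matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def dagger(m):
    """Hermitian conjugate of a 3x3 matrix with complex elements."""
    return [[m[j][i].conjugate() for j in range(3)] for i in range(3)]

w = [1.0, 2.0, 2.0]                               # illustrative omega, magnitude 3
omega = math.sqrt(sum(c * c for c in w))
n_hat = [c / omega for c in w]
a = [2 / math.sqrt(5), -1 / math.sqrt(5), 0.0]    # unit vector perpendicular to omega
b = cross(n_hat, a)
vecs = [[(a[i] - 1j * b[i]) / math.sqrt(2) for i in range(3)],
        [(a[i] + 1j * b[i]) / math.sqrt(2) for i in range(3)],
        [complex(c) for c in n_hat]]

# Cross-product matrix: W V = omega x V for any V
W = [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]

D = [[vecs[k][i] for k in range(3)] for i in range(3)]   # D_ik = i-th component of V^(k)
unit = matmul(dagger(D), D)                              # should be the unit matrix
E = matmul(dagger(D), matmul(W, D))                      # should be diag(3i, -3i, 0)

print([E[k][k] for k in range(3)])
```

Up to rounding, the diagonal of `E` holds the three eigenvalues and every off-diagonal element vanishes.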



7.15 Determinant and Trace of Normal Operator 

For a normal operator, the determinant and trace defined in Section 7.4 can be written 
in terms of the eigenvalues of the operator. Taking the determinant of the second of 
eqn (7.105) and using eqn (7.35) gives 

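$$ \det\mathcal{A} = \det\left(\mathcal{D}\mathcal{E}\mathcal{D}^\dagger\right) = \det\mathcal{D}\,\det\mathcal{E}\,\det\mathcal{D}^\dagger \tag{7.109} $$

Also, taking the determinant of eqn (7.103) gives

$$ 1 = \det\mathcal{U} = \det\left(\mathcal{D}^\dagger\mathcal{D}\right) = \det\mathcal{D}^\dagger\,\det\mathcal{D} \tag{7.110} $$

Thus, noting the diagonal form of $E$ in eqn (7.108), we obtain

$$ \det\mathcal{A} = \det\mathcal{E} = \lambda_1\lambda_2\lambda_3 \tag{7.111} $$

Similarly, taking the trace of both sides of eqn (7.105) gives

$$ \mathrm{Tr}\,\mathcal{A} = \mathrm{Tr}\left(\mathcal{D}\mathcal{E}\mathcal{D}^\dagger\right) = \mathrm{Tr}\left(\mathcal{D}^\dagger\mathcal{D}\mathcal{E}\right) = \mathrm{Tr}\left(\mathcal{U}\mathcal{E}\right) = \mathrm{Tr}\,\mathcal{E} = \lambda_1 + \lambda_2 + \lambda_3 \tag{7.112} $$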

where eqn (7.36) was used, and the final value of the trace was obtained by inspection 
of eqn (7.108). 
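As a quick check, these two results can be applied to the real anti-symmetric operator of Theorem 7.13.1, whose eigenvalues are listed in eqn (7.89). This is a worked sketch added for illustration:

```latex
\det\mathcal{W} = \lambda_1\lambda_2\lambda_3 = (i\omega)(-i\omega)(0) = 0,
\qquad
\mathrm{Tr}\,\mathcal{W} = \lambda_1 + \lambda_2 + \lambda_3 = i\omega - i\omega + 0 = 0
```

Both values agree with what is expected of a real anti-symmetric matrix in three dimensions: its diagonal elements vanish, so its trace is zero, and $\det W = \det W^T = \det(-W) = -\det W$ forces the determinant to vanish.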
7.16 Eigen-Dyadic Expansion of Normal Operator

Any linear operator has an equivalent dyadic as defined in Section 7.6. However,
for normal operators, that dyadic can be expanded in a form that depends only on
the eigenvectors and eigenvalues of the operator. We will call this the eigen-dyadic
expansion. Since operators and dyadics are equivalent, normal operators are thus
completely determined by their eigenvectors and eigenvalues. Expansions of this sort
are used, for example, in the proof of the Euler Theorem in Chapter 8. But they are
also important to the reader because of their frequent use in quantum theory.

Theorem 7.16.1: Eigen-Dyadic Expansion

If $\mathcal{A}$ is a normal operator whose eigenvalues $\lambda_k$ and orthonormal eigenvectors $\hat{\mathbf{V}}^{(k)}$ are
known, then its dyadic $\mathbb{A}$ can be expanded in eigen-dyadic form

$$ \mathbb{A} = \sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\lambda_k\hat{\mathbf{V}}^{(k)*} \tag{7.113} $$

which expresses $\mathbb{A}$ entirely in terms of the eigenvalues and eigenvectors of $\mathcal{A}$.

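Proof: This result follows from the component expansion of the second of eqn
(7.105), which is

$$ A_{ij} = \sum_{k=1}^{3}\sum_{l=1}^{3} D_{ik}E_{kl}D^\dagger_{lj} = \sum_{k=1}^{3}\sum_{l=1}^{3}\hat{V}_i^{(k)}\lambda_l\delta_{kl}\hat{V}_j^{(l)*} = \sum_{k=1}^{3}\hat{V}_i^{(k)}\lambda_k\hat{V}_j^{(k)*} \tag{7.114} $$

where eqns (7.99, 7.106) have been used. Substituting that result into the definition
of the dyadic $\mathbb{A}$ in eqn (7.52) gives

$$ \mathbb{A} = \sum_{i=1}^{3}\sum_{j=1}^{3}\hat{\mathbf{e}}_i A_{ij}\hat{\mathbf{e}}_j = \sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{3}\hat{\mathbf{e}}_i\hat{V}_i^{(k)}\lambda_k\hat{V}_j^{(k)*}\hat{\mathbf{e}}_j = \sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\lambda_k\hat{\mathbf{V}}^{(k)*} \tag{7.115} $$

which is eqn (7.113). □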

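Thus the expansion of $\mathbf{W} = \mathbb{A}\cdot\mathbf{V}$ for a general vector $\mathbf{V}$ in eqn (7.56), can equally
well be written as

$$ \mathbf{W} = \mathbb{A}\cdot\mathbf{V} = \sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\lambda_k\left(\hat{\mathbf{V}}^{(k)*}\cdot\mathbf{V}\right) \tag{7.116} $$

Note that the right vector in eqn (7.113) is already complex conjugated and is simply
dotted onto $\mathbf{V}$ in eqn (7.116) without change.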
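The eigen-dyadic expansion can be illustrated by rebuilding a matrix from its eigenvalues and eigenvectors. The sketch below is plain Python and uses the anti-symmetric operator of Section 7.13 as the normal operator; the vector (1, 2, 2), the unit vector `a`, the test vector `X`, and all names are illustrative assumptions rather than anything from the text.

```python
import math

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

w = [1.0, 2.0, 2.0]                               # illustrative omega, magnitude 3
omega = math.sqrt(sum(c * c for c in w))
n_hat = [c / omega for c in w]
a = [2 / math.sqrt(5), -1 / math.sqrt(5), 0.0]    # unit vector perpendicular to omega
b = cross(n_hat, a)
vecs = [[(a[i] - 1j * b[i]) / math.sqrt(2) for i in range(3)],
        [(a[i] + 1j * b[i]) / math.sqrt(2) for i in range(3)],
        [complex(c) for c in n_hat]]
lams = [1j * omega, -1j * omega, 0j]

# Matrix elements from the eigen-dyadic: A_ij = sum_k V_i^(k) lambda_k V_j^(k)*
A = [[sum(vecs[k][i] * lams[k] * vecs[k][j].conjugate() for k in range(3))
      for j in range(3)] for i in range(3)]

# Cross-product matrix of omega, for comparison
W = [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]
matrix_error = max(abs(A[i][j] - W[i][j]) for i in range(3) for j in range(3))

# Action on a vector: sum_k V^(k) lambda_k (V^(k)* . X)
X = [0.5, -1.0, 2.0]
via_dyadic = [sum(vecs[k][i] * lams[k] *
                  sum(vecs[k][j].conjugate() * X[j] for j in range(3))
                  for k in range(3)) for i in range(3)]
direct = cross(w, X)
vector_error = max(abs(p - q) for p, q in zip(via_dyadic, direct))

print(matrix_error < 1e-12, vector_error < 1e-12)
```

Although the eigenvectors are complex, the rebuilt matrix is real to within rounding, as it must be for a real operator.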

As shown in Lemma 7.13.2, the eigenvectors are not uniquely determined. If each
$\hat{\mathbf{V}}^{(k)}$ is multiplied by a factor $\exp(i\alpha_k)$, the result will be an orthonormal set of
eigenvectors that are equivalent to the original ones. However, this indeterminacy does not
affect the dyadic defined in eqn (7.113).
Lemma 7.16.2: Uniqueness of Eigen-Dyadic

The eigen-dyadic in eqn (7.113) is uniquely determined even though the eigenvectors are
not.

Proof: Replacing $\hat{\mathbf{V}}^{(k)}$ by $\exp(i\alpha_k)\hat{\mathbf{V}}^{(k)}$ in eqn (7.113) gives

$$ \sum_{k=1}^{3}\left(\exp(i\alpha_k)\hat{\mathbf{V}}^{(k)}\right)\lambda_k\left(\exp(i\alpha_k)\hat{\mathbf{V}}^{(k)}\right)^* = \sum_{k=1}^{3}\exp(i\alpha_k - i\alpha_k)\,\hat{\mathbf{V}}^{(k)}\lambda_k\hat{\mathbf{V}}^{(k)*} = \sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\lambda_k\hat{\mathbf{V}}^{(k)*} \tag{7.117} $$

which is identical to the original dyadic $\mathbb{A}$. □



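The resolution of unity dyadic in eqn (7.63) of Section 7.7 can also be expanded
in terms of the eigenvectors of a normal operator. The second equality in eqn (7.104)
implies that

$$ \delta_{ij} = \sum_{k=1}^{3} D_{ik}D^\dagger_{kj} = \sum_{k=1}^{3}\hat{V}_i^{(k)}\hat{V}_j^{(k)*} \tag{7.118} $$

It follows that

$$ \mathbb{U} = \sum_{i=1}^{3}\sum_{j=1}^{3}\hat{\mathbf{e}}_i\delta_{ij}\hat{\mathbf{e}}_j = \sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{3}\hat{\mathbf{e}}_i\hat{V}_i^{(k)}\hat{V}_j^{(k)*}\hat{\mathbf{e}}_j = \sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\hat{\mathbf{V}}^{(k)*} \tag{7.119} $$

which is the required expansion. Note that the only difference between eqns (7.113,
7.119) is that the former multiplies the terms by the eigenvalues $\lambda_k$ before adding
them.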



7.17 Functions of Normal Operators 

The eigen-dyadic expansion eqn (7.113) can be used to define general functions of 
normal operators and dyadics. 

Definition 7.17.1: Functions of Normal Operators

If a function $f(z)$ is well defined for all eigenvalues $\lambda_k$ of a normal operator $\mathcal{A}$, then the
dyadic function $\mathbb{F} = f(\mathbb{A})$ of the dyadic $\mathbb{A}$ is defined by

$$ \mathbb{F} = f(\mathbb{A}) = \sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)} f(\lambda_k)\,\hat{\mathbf{V}}^{(k)*} \tag{7.120} $$

which is the same as eqn (7.113), but with $f(\lambda_k)$ replacing $\lambda_k$.

The operator function $\mathcal{F} = f(\mathcal{A})$ of the normal operator $\mathcal{A}$ is then defined by the
condition that its effect on any vector $\mathbf{V}$ be the same as that of the dyadic: $\mathcal{F}\mathbf{V} = \mathbb{F}\cdot\mathbf{V}$.

This definition has the consequence that $\mathcal{F} = f(\mathcal{A})$ has the same eigenvectors as
does $\mathcal{A}$, and eigenvalues $\gamma_k = f(\lambda_k)$,

$$ \mathcal{F}\hat{\mathbf{V}}^{(k)} = \gamma_k\hat{\mathbf{V}}^{(k)} \quad\text{where}\quad \gamma_k = f(\lambda_k) \tag{7.121} $$

To see this, note that

$$ \mathcal{F}\hat{\mathbf{V}}^{(k)} = \mathbb{F}\cdot\hat{\mathbf{V}}^{(k)} = \sum_{l=1}^{3}\hat{\mathbf{V}}^{(l)} f(\lambda_l)\,\hat{\mathbf{V}}^{(l)*}\cdot\hat{\mathbf{V}}^{(k)} = \sum_{l=1}^{3}\hat{\mathbf{V}}^{(l)} f(\lambda_l)\,\delta_{lk} = f(\lambda_k)\,\hat{\mathbf{V}}^{(k)} \tag{7.122} $$

This result is important because it proves that any well-defined function of a normal
operator is also a normal operator, with the same orthonormal set of eigenvectors. Of
course, the eigenvalues $\gamma_k$ in general are different from the eigenvalues $\lambda_k$.

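If the function $f$ is a very simple one, like $f(z) = z^n$ where $n$ is some positive
integer then, as we would expect, $\mathcal{F}$ is the product of $\mathcal{A}$ with itself $n$ times, as in

$$ \mathcal{F} = \mathcal{A}^n = \underbrace{\mathcal{A}\mathcal{A}\cdots\mathcal{A}}_{n\ \text{factors}} \tag{7.123} $$

For example, consider the case $n = 2$. Then eqn (7.113), and the orthogonality
relation $\hat{\mathbf{V}}^{(k)*}\cdot\hat{\mathbf{V}}^{(l)} = \delta_{kl}$, give

$$ \mathbb{A}^2 = \mathbb{A}\cdot\mathbb{A} = \left(\sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\lambda_k\hat{\mathbf{V}}^{(k)*}\right)\cdot\left(\sum_{l=1}^{3}\hat{\mathbf{V}}^{(l)}\lambda_l\hat{\mathbf{V}}^{(l)*}\right) = \sum_{k=1}^{3}\sum_{l=1}^{3}\hat{\mathbf{V}}^{(k)}\lambda_k\delta_{kl}\lambda_l\hat{\mathbf{V}}^{(l)*} = \sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\lambda_k^2\,\hat{\mathbf{V}}^{(k)*} \tag{7.124} $$

which is the definition in eqn (7.120) for this function. This result for $n = 2$ can be
generalized in an obvious way to any positive integer $n$.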

The definition eqn (7.120) is well defined even if the function f(z) does not have 
a power series expansion. But if it does have one, the following theorem applies. 

Theorem 7.17.2: Function as Power Series

If function $f(z)$ has a power series expansion

$$ f = f(z) = a_0 + a_1 z + a_2 z^2 + \cdots = \sum_{n=0}^{\infty} a_n z^n \tag{7.125} $$

and all eigenvalues $\lambda_k$ of $\mathcal{A}$ lie in the circle of convergence of the power series, then the
operator function $\mathcal{F} = f(\mathcal{A})$ in Definition 7.17.1 equals a convergent power series in
operator $\mathcal{A}$,

$$ \mathcal{F} = f(\mathcal{A}) = a_0\,\mathcal{U} + a_1\mathcal{A} + a_2\mathcal{A}^2 + \cdots \tag{7.126} $$
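Proof: Using eqn (7.123) repeatedly gives

$$ \mathbb{F} = \sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\left(a_0 + a_1\lambda_k + a_2\lambda_k^2 + \cdots\right)\hat{\mathbf{V}}^{(k)*} = a_0\sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\hat{\mathbf{V}}^{(k)*} + a_1\sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\lambda_k\hat{\mathbf{V}}^{(k)*} + a_2\sum_{k=1}^{3}\hat{\mathbf{V}}^{(k)}\lambda_k^2\,\hat{\mathbf{V}}^{(k)*} + \cdots = a_0\,\mathbb{U} + a_1\mathbb{A} + a_2\mathbb{A}^2 + \cdots \tag{7.127} $$

which converges whenever all eigenvalues of $\mathcal{A}$ lie in the circle of convergence of the
power series. The equivalent operator equation is then the power series

$$ \mathcal{F} = f(\mathcal{A}) = a_0\,\mathcal{U} + a_1\mathcal{A} + a_2\mathcal{A}^2 + \cdots \tag{7.128} $$

□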



7.18 The Exponential Function 

The exponential function is of particular importance in the treatment of rotation
operators. The power series expansion of this function,

$$ f(z) = \exp(z) = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots \tag{7.129} $$

converges for any $z$. Thus, for any normal operator $\mathcal{A}$, Theorem 7.17.2 shows that the
function $\mathcal{F} = \exp(\mathcal{A})$ can be expanded in a power series as

$$ \mathcal{F} = \exp(\mathcal{A}) = \mathcal{U} + \mathcal{A} + \frac{\mathcal{A}^2}{2!} + \frac{\mathcal{A}^3}{3!} + \cdots \tag{7.130} $$

Then, if $\mathcal{A}$ has eigenvectors $\hat{\mathbf{V}}^{(k)}$ and eigenvalues $\lambda_k$, the operator $\mathcal{F} = \exp(\mathcal{A})$ is also
a normal operator with the same eigenvectors, and eigenvalues $\gamma_k = \exp(\lambda_k)$.

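If $\theta$ is a scalar, and if $\mathcal{A}$ does not depend on $\theta$, then the function $f(z) = \exp(\theta z)$
produces the power series

$$ \mathcal{F}(\theta) = \exp(\theta\mathcal{A}) = \mathcal{U} + \theta\mathcal{A} + \frac{\theta^2\mathcal{A}^2}{2!} + \frac{\theta^3\mathcal{A}^3}{3!} + \cdots \tag{7.131} $$

Differentiating eqn (7.131) term-by-term gives

$$ \frac{d}{d\theta}\exp(\theta\mathcal{A}) = \mathcal{A} + \theta\mathcal{A}^2 + \frac{\theta^2\mathcal{A}^3}{2!} + \cdots = \mathcal{A}\left(\mathcal{U} + \theta\mathcal{A} + \frac{\theta^2\mathcal{A}^2}{2!} + \cdots\right) = \mathcal{A}\exp(\theta\mathcal{A}) \tag{7.132} $$

which shows that $\mathcal{F}(\theta)$ defined in eqn (7.131) is a solution to the operator differential
equation

$$ \frac{d\mathcal{F}(\theta)}{d\theta} = \mathcal{A}\,\mathcal{F}(\theta) \tag{7.133} $$

where the initial condition $\mathcal{F}(0) = \mathcal{U}$ is assumed.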
If two operators commute, then they can be manipulated in the same way as
ordinary numbers. Thus it follows from the same proofs as are found in standard
calculus books that if operators $\mathcal{A}$ and $\mathcal{B}$ commute, $[\mathcal{A}, \mathcal{B}]_c = 0$, the exponential
functions will have the property that

$$ \exp(\mathcal{A})\exp(\mathcal{B}) = \exp(\mathcal{A} + \mathcal{B}) = \exp(\mathcal{B})\exp(\mathcal{A}) \tag{7.134} $$

Since any operator $\mathcal{A}$ commutes with $(-\mathcal{A})$, it follows that

$$ \exp(\mathcal{A})\exp(-\mathcal{A}) = \exp(-\mathcal{A})\exp(\mathcal{A}) = \exp(0\mathcal{A}) = \mathcal{U} \tag{7.135} $$

Thus, whether $\mathcal{A}$ is singular or nonsingular, $\mathcal{F} = \exp(\theta\mathcal{A})$ is always nonsingular, with
the inverse $\mathcal{F}^{-1} = \exp(-\theta\mathcal{A})$.
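The two routes to the exponential, the power series of eqn (7.130) and the eigen-dyadic form of Definition 7.17.1 with eigenvalues replaced by their exponentials, can be compared numerically. The sketch below is plain Python applied to the real anti-symmetric operator of Section 7.13; the vector (1, 2, 2), the unit vector `a`, the number of series terms, and all helper names are illustrative assumptions. It also checks that the result is a real orthogonal matrix, the property that Exercise 7.2(d) asks the reader to prove.

```python
import cmath
import math

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_series(m, terms=60):
    """Partial sum U + m + m^2/2! + ... of the exponential power series."""
    total = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in total]
    for p in range(1, terms):
        term = [[x / p for x in row] for row in matmul(term, m)]  # m^p / p!
        total = [[total[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return total

w = [1.0, 2.0, 2.0]                               # illustrative omega, magnitude 3
omega = math.sqrt(sum(c * c for c in w))
n_hat = [c / omega for c in w]
a = [2 / math.sqrt(5), -1 / math.sqrt(5), 0.0]    # unit vector perpendicular to omega
b = cross(n_hat, a)
vecs = [[(a[i] - 1j * b[i]) / math.sqrt(2) for i in range(3)],
        [(a[i] + 1j * b[i]) / math.sqrt(2) for i in range(3)],
        [complex(c) for c in n_hat]]
lams = [1j * omega, -1j * omega, 0j]
W = [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]

R_series = exp_series(W)
R_eigen = [[sum(vecs[k][i] * cmath.exp(lams[k]) * vecs[k][j].conjugate()
                for k in range(3)) for j in range(3)] for i in range(3)]
diff = max(abs(R_series[i][j] - R_eigen[i][j]) for i in range(3) for j in range(3))

# Orthogonality check: transpose(R) R should be the unit matrix
R_T = [[R_series[j][i] for j in range(3)] for i in range(3)]
RtR = matmul(R_T, R_series)
orth_err = max(abs(RtR[i][j] - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))

print(diff < 1e-10, orth_err < 1e-10)
```

The eigen-dyadic route needs only three scalar exponentials, whereas the series route needs repeated matrix products; both should agree to rounding error.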

7.19 The Dirac Notation 

Quantum mechanics uses complex vectors and operators similar to those described 
in Sections 7.9 and 7.10. The main difference is that the quantum vectors may have 
infinite dimension. 

Quantum mechanics also uses a different notation for complex vectors, called the
Dirac notation. 41 We have denoted a vector by $\mathbf{V}$ where the use of bold-face type
indicates that it is a vector, and the letter “V” is a label indicating which vector it is.
The Dirac notation denotes a vector by what is called a ket $|V\rangle$ where the $|\ \rangle$ indicates
that this is a vector, and the letter “V” is its label.

Inner products, which we write $\mathbf{V}^*\cdot\mathbf{W}$, are written by reversing the ket to form
what is called a bra $\langle V|$, so that together the two parts of the inner product form a
bra-ket $\langle V|W\rangle$. Note that the bar is not doubled in the inner product of a bra and a
ket.

Operators are variously notated. One common notation, which we will adopt here,
is to place a hat symbol over the operator. For example, an equation that we would
write $\mathbf{W} = \mathcal{F}\mathbf{V}$ would be $|W\rangle = \hat{\mathcal{F}}|V\rangle$ in Dirac notation.

The Dirac notation is essentially dyadic. The dyadic $\mathbb{F}$ defined in eqn (7.52) is
written in Dirac notation with the ket and bra vectors poised to make inner products
to the left or the right. Thus, the dyadic associated with operator $\mathcal{F}$ would be written

$$ \hat{\mathcal{F}} = \sum_{i=1}^{3}\sum_{i'=1}^{3}|e_i\rangle F_{ii'}\langle e_{i'}| \tag{7.136} $$

where the matrix element that we write $F_{ii'} = \hat{\mathbf{e}}_i\cdot(\mathcal{F}\hat{\mathbf{e}}_{i'})$ is written as

$$ F_{ii'} = \langle e_i|\hat{\mathcal{F}}|e_{i'}\rangle \tag{7.137} $$

The kets $|e_i\rangle$ here are the Cartesian unit vectors that we denote by $\hat{\mathbf{e}}_i$. Notice that in
quantum mechanics, the distinction between operator $\mathcal{F}$ and associated dyadic $\mathbb{F}$ is
ignored. So, in eqn (7.136) the operator is considered to be equal to its dyadic.

41 Most quantum texts treat the Dirac notation. For a definitive statement of it, see Dirac (1935). 
Eigenvectors that are labeled by an index $k$ are often denoted by kets using just
that index as their label. Thus eqn (7.85) in the Dirac notation is

$$ \hat{\mathcal{F}}|k\rangle = \lambda_k|k\rangle \tag{7.138} $$

where the eigenvector we denoted by $\hat{\mathbf{V}}^{(k)}$ is denoted simply by $|k\rangle$. This extreme
freedom in choosing labels for bras and kets is one of the strengths of the Dirac
notation. The orthogonality of the eigenvectors in eqn (7.96) becomes simply $\langle k|l\rangle = \delta_{kl}$.

The eigen-dyadic of a normal operator defined in eqn (7.113) is then written

$$ \hat{\mathcal{F}} = \sum_{k=1}^{3}|k\rangle\lambda_k\langle k| \tag{7.139} $$

and the resolution of unity from eqn (7.119) is written

$$ \hat{\mathcal{U}} = \sum_{k=1}^{3}|k\rangle\langle k| \tag{7.140} $$

In the Dirac notation, the definition of Hermitian conjugate is extended to apply
also to bras and kets. Since, from eqn (7.83), the inner product is expressed in terms
of component column vectors by using the Hermitian conjugate of $[V]$,

$$ \langle V|W\rangle = [V]^\dagger[W] \tag{7.141} $$

the Dirac notation defines the bra $\langle V|$ as the Hermitian conjugate of the ket $|V\rangle$, as
in $\langle V| = |V\rangle^\dagger$. The bras are considered to be a separate vector space, called the dual
space, and expressions like $\langle V| + |W\rangle$ adding a bra and a ket make no sense and are
forbidden.

The flexibility in labeling kets leads to certain limitations in the Dirac notation.
If a ket is multiplied by a number, that number cannot be taken inside the ket. Thus
$\alpha|\psi\rangle \neq |\alpha\psi\rangle$ since the expression on the right is nonsense, a label multiplied by a
number. Also, in the case of the eigenkets $|k\rangle$ such a usage could lead to errors. Clearly,
$3|1\rangle \neq |3\rangle$ since the eigenkets $|1\rangle$ and $|3\rangle$ are distinct members of an orthonormal set
of eigenvectors.



7.20 Exercises 

Exercise 7.1 A 3 × 3 real matrix $R$ can be thought of as three 3 × 1 column vectors.

$$ R = \left(\begin{bmatrix} R_{11} \\ R_{21} \\ R_{31} \end{bmatrix}\ \begin{bmatrix} R_{12} \\ R_{22} \\ R_{32} \end{bmatrix}\ \begin{bmatrix} R_{13} \\ R_{23} \\ R_{33} \end{bmatrix}\right) \tag{7.142} $$

(a) Using the formalism of sums and indices, write out the $ij$ components of both sides of the
equation

$$ R^T R = U \tag{7.143} $$

and show that it is true if and only if the three column vectors in eqn (7.142) are normalized
and mutually orthogonal (i.e. an orthonormal set of vectors).
(b) Show that eqn (7.143) is true if and only if $R^{-1}$ exists and $R^T = R^{-1}$, and hence that a
real matrix is orthogonal if and only if its column vectors are an orthonormal set.

Exercise 7.2 In the following, you may use the fact that whenever operators $\mathcal{A}$ and $\mathcal{B}$
commute, so that $[\mathcal{A}, \mathcal{B}]_c = 0$, then

$$ e^{\mathcal{A}}e^{\mathcal{B}} = e^{(\mathcal{A}+\mathcal{B})} = e^{\mathcal{B}}e^{\mathcal{A}} \tag{7.144} $$

[Although it doesn’t matter for this exercise, note that this equation is not true when they fail
to commute.]

(a) Suppose that $\mathcal{A}$ is a normal operator. Prove that $\mathcal{B}$ defined by

$$ \mathcal{B} = e^{\mathcal{A}} = \mathcal{U} + \mathcal{A} + \frac{1}{2!}\mathcal{A}^2 + \cdots \tag{7.145} $$

is also a normal operator.

(b) Prove that if eqn (7.145) holds and $\mathcal{A}$ is a normal operator, then

$$ \det\mathcal{B} = e^{\mathrm{Tr}\,\mathcal{A}} \tag{7.146} $$

(c) Let $\mathcal{A}$ be a normal operator, which may or may not be singular. Prove that an operator $\mathcal{B}$
defined from this $\mathcal{A}$ by eqn (7.145) has an inverse given by

$$ \mathcal{B}^{-1} = e^{-\mathcal{A}} \tag{7.147} $$

and hence is nonsingular.

(d) Prove that if $\mathcal{A}$ is a real, anti-symmetric operator, then the $\mathcal{B}$ defined in eqn (7.145) will
be a real orthogonal operator. Find the value of its determinant $\det\mathcal{B}$.

(e) Use the power series expansion of the exponential to prove that

$$ \mathcal{C}\mathcal{B}\mathcal{C}^{-1} = e^{\mathcal{C}\mathcal{A}\mathcal{C}^{-1}} \tag{7.148} $$

where $\mathcal{C}$ is any nonsingular operator, and eqn (7.145) is assumed to hold.




Fig. 7.2. Illustration for Exercise 7.3. 

Exercise 7.3 Consider a plane mirror. Denote the unit vector normal to its surface and
pointing out into the room by $\hat{\mathbf{n}}$.

(a) The operator $\mathcal{M}$ converts a general vector $\mathbf{V}$ in front of the mirror into its reflected image
$\mathbf{V}^{(M)} = \mathcal{M}\mathbf{V}$ behind the mirror. Find a general expression for its matrix elements $M_{ij}$ in
terms of the components $n_i$ of vector $\hat{\mathbf{n}}$. [Hint: Write $\mathbf{V}$ and $\mathbf{V}^{(M)}$ as sums of vectors parallel
and perpendicular to $\hat{\mathbf{n}}$ using eqn (A.3).]

(b) Is $\mathcal{M}$ an orthogonal operator? What is $\det\mathcal{M}$?

(c) Write the dyadic $\mathbb{M}$ corresponding to operator $\mathcal{M}$, expressing it in terms of the unit dyadic
and $\hat{\mathbf{n}}$.

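Exercise 7.4 Refer to eqn (A.3) of Appendix A. Operators $\mathcal{P}_\parallel$ and $\mathcal{P}_\perp$, which are called
projection operators, are defined by

$$ \mathbf{V}_\parallel = \mathcal{P}_\parallel\mathbf{V} \qquad \mathbf{V}_\perp = \mathcal{P}_\perp\mathbf{V} \tag{7.149} $$

for any general vector $\mathbf{V}$.

(a) Find the matrices of these two operators, writing them in terms of the components of $\hat{\mathbf{n}}$.

(b) Prove that

$$ (\mathcal{P}_\parallel)^2 = \mathcal{P}_\parallel \qquad (\mathcal{P}_\perp)^2 = \mathcal{P}_\perp \qquad \mathcal{P}_\parallel\mathcal{P}_\perp = 0 = \mathcal{P}_\perp\mathcal{P}_\parallel \qquad \mathcal{P}_\parallel + \mathcal{P}_\perp = \mathcal{U} \tag{7.150} $$

(c) Are $\mathcal{P}_\parallel$ and $\mathcal{P}_\perp$ orthogonal operators? Do they have inverses?

(d) Write the projection dyadics $\mathbb{P}_\parallel$ and $\mathbb{P}_\perp$ corresponding to $\mathcal{P}_\parallel$ and $\mathcal{P}_\perp$, respectively. Write
them in terms of the unit dyadic and the vector $\hat{\mathbf{n}}$.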



Exercise 7.5 In Section 7.5, the general anti-symmetric operator $\mathcal{W}$ is defined in terms of a
vector $\boldsymbol{\omega}$ by its action $\mathcal{W}\mathbf{V} = \boldsymbol{\omega}\times\mathbf{V}$ on any arbitrary vector $\mathbf{V}$.

(a) Use the eigenvalues and eigenvectors listed in eqns (7.89, 7.90) to verify that, for $k =
1, 2, 3$,

$$ \mathcal{W}\hat{\mathbf{V}}^{(k)} = \lambda_k\hat{\mathbf{V}}^{(k)} \tag{7.151} $$

(b) Verify that these eigenvectors are orthogonal and normalized, using the extended
definition of dot product appropriate for complex vectors.

$$ \hat{\mathbf{V}}^{(k)*}\cdot\hat{\mathbf{V}}^{(l)} = \delta_{kl} \tag{7.152} $$

(c) Derive the identities given in eqns (7.93, 7.94). Use them to write out the alternate
eigenvectors $\hat{\mathbf{V}}^{(k)\prime}$ in eqn (7.95) in terms of $\hat{\mathbf{a}}'$, $\hat{\mathbf{b}}'$, and $\boldsymbol{\omega}$.

Exercise 7.6 A complex, spherical basis, $\hat{\mathbf{e}}_m^{(sp)}$ for $m = -1, 0, +1$, in a three-dimensional
Cartesian space may be defined as

$$ \hat{\mathbf{e}}_{+1}^{(sp)} = \frac{1}{\sqrt{2}}\left(-\hat{\mathbf{e}}_1 + i\hat{\mathbf{e}}_2\right) \qquad \hat{\mathbf{e}}_0^{(sp)} = \hat{\mathbf{e}}_3 \qquad \hat{\mathbf{e}}_{-1}^{(sp)} = \frac{1}{\sqrt{2}}\left(\hat{\mathbf{e}}_1 + i\hat{\mathbf{e}}_2\right) \tag{7.153} $$

(a) Prove that these basis vectors are orthonormal, using the complex inner product defined
in Section 7.10,

$$ \hat{\mathbf{e}}_m^{(sp)*}\cdot\hat{\mathbf{e}}_{m'}^{(sp)} = \delta_{mm'} \tag{7.154} $$

(b) Show that the dyadic

$$ \mathbb{U} = \hat{\mathbf{e}}_{+1}^{(sp)}\hat{\mathbf{e}}_{+1}^{(sp)*} + \hat{\mathbf{e}}_0^{(sp)}\hat{\mathbf{e}}_0^{(sp)*} + \hat{\mathbf{e}}_{-1}^{(sp)}\hat{\mathbf{e}}_{-1}^{(sp)*} \tag{7.155} $$

is equal to the resolution of unity dyadic $\mathbb{U}$ defined in eqn (7.63).
(c) Use the resolution of unity in eqn (7.155) to prove that a general vector $\mathbf{A}$ can be expanded
as

$$ \mathbf{A} = \sum_{m=+1,0,-1} A_m^{(sp)}\hat{\mathbf{e}}_m^{(sp)} \quad\text{where}\quad A_m^{(sp)} = \hat{\mathbf{e}}_m^{(sp)*}\cdot\mathbf{A} \tag{7.156} $$

[Recall that dyadics in complex spaces are written with the right-hand vector already
complex conjugated so that no further complex conjugation is required when dotting them onto
vectors.]



Exercise 7.7

(a) Apply the expansion in eqn (7.156) to the radius vector $\mathbf{r}$. Write the resulting $r_m^{(sp)}$
components in terms of $x, y, z$.

(b) Write the components $r_m^{(sp)}$ in terms of spherical polar coordinates, and demonstrate that,
for $m = +1, 0, -1$,

$$ r_m^{(sp)} = r\sqrt{\frac{4\pi}{3}}\,Y_1^m(\theta, \phi) \tag{7.157} $$

where the $Y_\ell^m(\theta, \phi)$ are the standard spherical harmonics for $\ell = 1$ as listed, for example on
page 337 of (Shankar, 1994).

Exercise 7.8

(a) Use the definition of components in eqn (7.156) with the standard Cartesian expansion
$\mathbf{A} = \sum_{i=1}^{3} A_i\hat{\mathbf{e}}_i$ to show that the spherical components can be written in terms of the
Cartesian ones as

$$ A_m^{(sp)} = \sum_{i=1}^{3} T_{mi}A_i \quad\text{or, equivalently,}\quad \begin{pmatrix} A_{+1}^{(sp)} \\ A_0^{(sp)} \\ A_{-1}^{(sp)} \end{pmatrix} = T\begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} \tag{7.158} $$

where

$$ T_{mi} = \hat{\mathbf{e}}_m^{(sp)*}\cdot\hat{\mathbf{e}}_i \tag{7.159} $$

(b) Demonstrate that the matrix $T$ must be unitary, and check that the matrix you wrote is
indeed unitary.

(c) Using the resolution of unity eqn (7.155) or otherwise, show that the equation $\mathbf{B} = \mathcal{F}\mathbf{A}$
can be written as the equivalent component equation

$$ B_m^{(sp)} = \sum_{m'=+1,0,-1} F_{mm'}^{(sp)}A_{m'}^{(sp)} \quad\text{where}\quad F^{(sp)} = T F T^\dagger \tag{7.160} $$

gives the spherical matrix $F^{(sp)}$ in terms of the standard Cartesian matrix $F$ defined in eqn
(7.18).

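Exercise 7.9 An operator $\mathcal{F}$ is defined in terms of the unit operator $\mathcal{U}$ and a real,
anti-symmetric operator $\mathcal{W}$ by

$$ \mathcal{F} = \mathcal{U} + \mathcal{W} \tag{7.161} $$

The operator $\mathcal{W}$ is associated with a given vector $\boldsymbol{\omega}$, as described in Lemma 7.5.1.

(a) Show that $\mathcal{F}$ is a normal operator.

(b) Show that $\hat{\mathbf{V}}^{(3)} = \boldsymbol{\omega}/\omega$ is an eigenvector of $\mathcal{F}$.

(c) Find the eigenvalues of $\mathcal{F}$ and hence prove that $\mathcal{F}$ is nonsingular for any value of $\boldsymbol{\omega}$.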
Exercise 7.10 Consider a real, anti-symmetric operator $\mathcal{N}$. Suppose that the associated
vector discussed in Section 7.5 is the unit vector $\hat{\mathbf{n}} = (\hat{\mathbf{e}}_1 + 2\hat{\mathbf{e}}_2)/\sqrt{5}$. (Thus $\mathcal{N}\mathbf{V} = \hat{\mathbf{n}}\times\mathbf{V}$ for
any vector $\mathbf{V}$.)

(a) Show that $\hat{\mathbf{a}} = \hat{\mathbf{e}}_3$ is a suitable choice of the unit vector $\hat{\mathbf{a}}$ discussed in Section 7.13. Use it
to find three eigenvectors of $\mathcal{N}$. Show that they are orthonormal, using the complex definition
of inner product from eqn (7.83).

(b) Consider now an operator defined by $\mathcal{F} = \mathcal{U} + 2\mathcal{N} - 3\mathcal{N}^2$, where $\mathcal{U}$ is the unit operator.
Find eigenvectors and eigenvalues of $\mathcal{F}$.




8 



KINEMATICS OF ROTATION 



In this chapter, we develop the techniques needed to define the location and orientation
of a moving rigid body. Roughly speaking, rotation can be defined as what a rigid
body does. For example, imagine an artist’s construction consisting of straight sticks 
of various lengths glued together at their ends to make a rigid structure. As you turn 
such a construction in your hands, or move it closer for a better look, you will notice 
that the lengths of the sticks, and the angles between them, do not change. Thinking 
of those sticks as vectors, their general motion can be described by a class of linear op- 
erators called rotation operators, which have the special property that they preserve 
all vector lengths and relative orientations. 

8.1 Characterization of Rigid Bodies 

The concept of a rigid body is an idealization, since all real objects have some degree 
of elasticity. However, the theory in the present and following chapters, based on this 
idealization, provides a good first approximation to the behavior of many real objects. 

Definition 8.1.1: Definition of Rigid Body

A rigid body can be defined as a collection of point masses such that the distances between
them do not change. If $\mathbf{r}_l$ and $\mathbf{r}_n$ are the locations of any two masses $m_l$ and $m_n$ in the
body, relative to some inertial coordinate system, the body is rigid if and only if the
distances $d_{ln}$ defined for all $l, n$ values by

$$ \mathbf{r}_l - \mathbf{r}_n = \mathbf{d}_{ln} \quad\text{and}\quad \|\mathbf{d}_{ln}\| = d_{ln} \tag{8.1} $$

remain constant as the body moves.

We will refer to vectors $\mathbf{d}_{ln}$ between masses $m_l$ and $m_n$ as internal vectors.

The above definition implies that the dot product of any two internal vectors is a 
constant, regardless of where the masses occur in the rigid body. 

Lemma 8.1.2: Constancy of Dot Products 

For any masses $m_a$, $m_b$, $m_p$, $m_q$ of a rigid body,

$$ \mathbf{d}_{ab} \cdot \mathbf{d}_{pq} = \text{constant} \tag{8.2} $$

Proof: The lemma is proved in two stages. First consider any three distinct masses $m_a$, $m_b$, $m_p$ and the vectors between pairs of them. They form a triangle so that

$$ \mathbf{d}_{ap} - \mathbf{d}_{bp} = \mathbf{d}_{ab} \tag{8.3} $$

As the rigid body moves, this triangle will remain anchored to the same three masses. Now calculate the squared magnitude of the left and right sides of eqn (8.3),

$$ \|\mathbf{d}_{ap} - \mathbf{d}_{bp}\|^2 = \|\mathbf{d}_{ab}\|^2 \quad\text{or}\quad d_{ap}^2 - 2\,(\mathbf{d}_{ap} \cdot \mathbf{d}_{bp}) + d_{bp}^2 = d_{ab}^2 \tag{8.4} $$

The constancy defined in eqn (8.1) then implies that the squared terms in eqn (8.4) are all constants, and hence that the dot product must also be constant. The dot product of vectors $\mathbf{d}_{ap}$ and $\mathbf{d}_{bp}$, both of which start from mass $m_p$, must therefore remain constant.

Now consider any two internal vectors of the rigid body. Call them $\mathbf{d}_{ab}$ and $\mathbf{d}_{pq}$. Picking some other mass, which will be labeled with index $c$, we can write

$$ \mathbf{d}_{ab} = \mathbf{d}_{ac} - \mathbf{d}_{bc} \quad\text{and}\quad \mathbf{d}_{pq} = \mathbf{d}_{pc} - \mathbf{d}_{qc} \tag{8.5} $$

Therefore the expression

$$ \mathbf{d}_{ab} \cdot \mathbf{d}_{pq} = (\mathbf{d}_{ac} - \mathbf{d}_{bc}) \cdot (\mathbf{d}_{pc} - \mathbf{d}_{qc}) \tag{8.6} $$

contains only dot products of internal vectors originating at the common single mass $m_c$. But all such dot products have just been proved to be constant, so $\mathbf{d}_{ab} \cdot \mathbf{d}_{pq}$ must be a constant, which completes the proof of the lemma. □




Fig. 8.1. Relations among internal vectors. 



The dot product of two internal vectors is the product of their magnitudes, which 
are constant by the above definition, and the cosine of the angle between them. Thus 
the above definition and lemma also establish that the relative angle between any two 
internal vectors must remain constant as the body moves. 
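
The first stage of the proof can be checked numerically. Solving eqn (8.4) for the dot product shows that it is fixed by three distances alone. The following Python sketch compares that value with the directly computed dot product; the function names and the sample coordinates are illustrative assumptions, not part of the text.

```python
import math

def sub(u, v):
    """Component-wise difference u - v of two 3-vectors."""
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def dot_from_distances(d_ap, d_bp, d_ab):
    """Dot product d_ap . d_bp recovered from three distances by solving eqn (8.4)."""
    return 0.5 * (d_ap**2 + d_bp**2 - d_ab**2)

# Hypothetical positions of three masses a, b, p (assumed values)
r_a, r_b, r_p = [1.0, 2.0, 0.5], [-0.3, 0.7, 2.0], [0.2, -1.0, 1.0]
v_ap, v_bp, v_ab = sub(r_a, r_p), sub(r_b, r_p), sub(r_a, r_b)

direct = dot(v_ap, v_bp)
from_lengths = dot_from_distances(norm(v_ap), norm(v_bp), norm(v_ab))
```

Because the three distances are constant for a rigid body, the value returned by `dot_from_distances` cannot change as the body moves, which is the content of the first stage of the proof.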

8.2 The Center of Mass of a Rigid Body 

The center of mass $\mathbf{R}$ of any collection of point masses, including that of a rigid body, is given in eqn (1.32) as

$$ \mathbf{R} = \frac{1}{M} \sum_{n=1}^{N} m_n \mathbf{r}_n \tag{8.7} $$

where $M$ is the total mass. The relative position vector $\boldsymbol{\rho}_n$ of mass $m_n$ is then defined in eqn (1.33) by the equation

$$ \mathbf{r}_n = \mathbf{R} + \boldsymbol{\rho}_n \tag{8.8} $$

The relative position vectors may be used to give an alternate characterization of a rigid body.

Fig. 8.2. Center of mass and relative vectors for a rigid body. A typical mass $m_n$ of the body is shown.



Lemma 8.2.1: Constant Dot Product of Relative Position Vectors 

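A body is a rigid body if and only if, for all masses $m_l$ and $m_n$,

$$ \boldsymbol{\rho}_l \cdot \boldsymbol{\rho}_n = \text{constant} \tag{8.9} $$

Proof: First show that eqn (8.9) implies the constancy of $d_{ln}$, and hence that the body is a rigid body according to Definition 8.1.1. Equation (8.8) shows that any internal vector may be written

$$ \mathbf{d}_{ln} = \mathbf{r}_l - \mathbf{r}_n = \boldsymbol{\rho}_l - \boldsymbol{\rho}_n \tag{8.10} $$

since the $\mathbf{R}$ terms cancel. Thus

$$ d_{ln}^2 = \mathbf{d}_{ln} \cdot \mathbf{d}_{ln} = (\boldsymbol{\rho}_l - \boldsymbol{\rho}_n) \cdot (\boldsymbol{\rho}_l - \boldsymbol{\rho}_n) \tag{8.11} $$

Equation (8.9) implies that all dot products in the expansion of the right side of eqn (8.11) are constant. Hence each $d_{ln}$ is constant, as was to be proved.

Now prove the converse, that Definition 8.1.1 implies eqn (8.9). Use eqns (8.7, 8.8) to write, for any mass $m_n$,

$$ \boldsymbol{\rho}_n = \mathbf{r}_n - \mathbf{R} = \frac{1}{M} \sum_{p=1}^{N} m_p (\mathbf{r}_n - \mathbf{r}_p) = \frac{1}{M} \sum_{p=1}^{N} m_p \mathbf{d}_{np} \tag{8.12} $$

Using eqn (8.12), the expression $\boldsymbol{\rho}_l \cdot \boldsymbol{\rho}_n$ in eqn (8.9) becomes

$$ \boldsymbol{\rho}_l \cdot \boldsymbol{\rho}_n = \frac{1}{M^2} \sum_{p=1}^{N} \sum_{q=1}^{N} m_p m_q \, \mathbf{d}_{lp} \cdot \mathbf{d}_{nq} \tag{8.13} $$

which contains only dot products of the form $\mathbf{d}_{lp} \cdot \mathbf{d}_{nq}$, all of which were proved constant by Lemma 8.1.2, which completes the proof. □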

In general, the center of mass will not be at the location of one of the point masses. 
In fact, for a hollow body like a basketball or a teacup, the center of mass may be at 
some distance from the masses. But eqn (8.9) with l = n implies that the distance 
of the center of mass from any of the point masses is a constant. The center of mass 
moves rigidly with the body just as if it were one of the point masses. 
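
As a small numerical illustration of eqns (8.7), (8.8), and (8.12), the following Python sketch computes the center of mass and the relative position vectors for four point masses. The masses, positions, and function names are assumed purely for illustration. The sketch also evaluates the mass-weighted sum of the relative position vectors, which vanishes as a direct consequence of eqns (8.7) and (8.8).

```python
def center_of_mass(masses, positions):
    """R = (1/M) sum_n m_n r_n, eqn (8.7)."""
    M = sum(masses)
    return [sum(m * r[i] for m, r in zip(masses, positions)) / M for i in range(3)]

def relative_positions(masses, positions):
    """rho_n = r_n - R, from eqn (8.8)."""
    R = center_of_mass(masses, positions)
    return [[r[i] - R[i] for i in range(3)] for r in positions]

# Hypothetical body: four point masses (assumed values)
masses = [1.0, 2.0, 3.0, 4.0]
positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

R = center_of_mass(masses, positions)
rho = relative_positions(masses, positions)

# eqn (8.12): rho_n = (1/M) sum_p m_p (r_n - r_p)
M = sum(masses)
rho_from_internal = [
    [sum(m_p * (r_n[i] - r_p[i]) for m_p, r_p in zip(masses, positions)) / M
     for i in range(3)]
    for r_n in positions
]

# Mass-weighted sum of the relative position vectors
weighted = [sum(m * p[i] for m, p in zip(masses, rho)) for i in range(3)]
```

In this example the center of mass does not coincide with any of the four masses, in line with the remark above about hollow bodies.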
8.3 General Definition of Rotation Operator

Rigid bodies have been defined by the condition that the dot product of any two relative position vectors $\boldsymbol{\rho}_l \cdot \boldsymbol{\rho}_n$ must remain constant as the body rotates. We now investigate a class of linear operators called rotation operators that preserve the inner product of any two vectors and are therefore appropriate for describing the rotation of rigid bodies. These rotation operators will be applied to the kinematics of rigid bodies in Section 8.9.

The first property of rotation operators is linearity. Let a general rotation operator $\mathcal{R}$ transform a general vector $\mathbf{V}$ into the vector $\mathbf{V}^{(R)}$,

$$ \mathbf{V}^{(R)} = \mathcal{R}\mathbf{V} \tag{8.14} $$

Since we want operators that reproduce the behavior of rigid bodies, the first requirement placed on this operator must be that a triangle of internal vectors such as eqn (8.3) must be transformed into the same triangle of transformed vectors with

$$ \mathbf{d}_{ap}^{(R)} - \mathbf{d}_{bp}^{(R)} = \mathbf{d}_{ab}^{(R)} \tag{8.15} $$

or, introducing the operator,

$$ \mathcal{R}\mathbf{d}_{ap} - \mathcal{R}\mathbf{d}_{bp} = \mathcal{R}\mathbf{d}_{ab} = \mathcal{R}\,(\mathbf{d}_{ap} - \mathbf{d}_{bp}) \tag{8.16} $$

This condition is satisfied by requiring $\mathcal{R}$ to be a linear operator, as defined in eqn (7.1) of Chapter 7, so that for any vectors $\mathbf{V}$ and $\mathbf{W}$,

$$ \mathcal{R}\,(\alpha\mathbf{V} + \beta\mathbf{W}) = \alpha\,\mathcal{R}\mathbf{V} + \beta\,\mathcal{R}\mathbf{W} \tag{8.17} $$

However, linearity alone is not sufficient. In addition to being linear, the rotation operator must satisfy the conditions of the following definition.

Definition 8.3.1: Rotation Operator Defined 

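A rotation operator, sometimes referred to as a rotation, is defined as a linear operator that also satisfies any one of the following three equivalent definitions. Each of the definitions implies the other two.

1. Given vectors $\mathbf{V}$ and $\mathbf{W}$, define $\mathbf{V}^{(R)} = \mathcal{R}\mathbf{V}$ and $\mathbf{W}^{(R)} = \mathcal{R}\mathbf{W}$. Then a linear operator $\mathcal{R}$ is a rotation operator if and only if

$$ \mathbf{V} \cdot \mathbf{W} = \mathbf{V}^{(R)} \cdot \mathbf{W}^{(R)} \tag{8.18} $$

is satisfied for arbitrary $\mathbf{V}$, $\mathbf{W}$.

2. The linear operator $\mathcal{R}$ is a rotation operator if and only if there is some orthonormal triad of vectors $\hat{\mathbf{e}}_1$, $\hat{\mathbf{e}}_2$, $\hat{\mathbf{e}}_3$ obeying

$$ \hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_j = \delta_{ij} \tag{8.19} $$

such that $\hat{\mathbf{e}}_1^{(R)}$, $\hat{\mathbf{e}}_2^{(R)}$, $\hat{\mathbf{e}}_3^{(R)}$ is also an orthonormal triad of vectors, obeying

$$ \hat{\mathbf{e}}_i^{(R)} \cdot \hat{\mathbf{e}}_j^{(R)} = \delta_{ij} \tag{8.20} $$

where $\hat{\mathbf{e}}_i^{(R)} = \mathcal{R}\hat{\mathbf{e}}_i$ for each index $i = 1, 2, 3$.

3. A linear operator $\mathcal{R}$ is a rotation operator if and only if it possesses an inverse and its inverse is equal to its transpose,

$$ \mathcal{R}^{-1} \text{ exists, and } \mathcal{R}^{-1} = \mathcal{R}^{T} \tag{8.21} $$

so that

$$ \mathcal{R}^{T}\mathcal{R} = \mathcal{U} = \mathcal{R}\mathcal{R}^{T} \tag{8.22} $$

where $\mathcal{U}$ is the unit operator. As discussed in Section 7.5, this is the definition of an orthogonal operator, so this definition requires $\mathcal{R}$ to be a real, linear, orthogonal operator.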



Fig. 8.3. The lengths of the rotated vectors $\mathbf{V}^{(R)}$ and $\mathbf{W}^{(R)}$ are the same as those of the original vectors. Also, the angle between them is the same as that between the original ones.

Proof: (Proof of equivalence) We now prove that the condition in each of these definitions implies the condition in the following one, in the pattern $1 \Rightarrow 2 \Rightarrow 3 \Rightarrow 1$. This implies that an operator $\mathcal{R}$ satisfying any of the definitions will also satisfy the other two, and thus that the three definitions are equivalent.

Since the vectors $\mathbf{V}$ and $\mathbf{W}$ in Definition 1 are assumed arbitrary, they can be taken to be $\hat{\mathbf{e}}_i$ and $\hat{\mathbf{e}}_j$. Thus the condition of Definition 1 implies that of Definition 2.

The condition in Definition 2 can be written, for $i, j = 1, 2, 3$,

$$ \hat{\mathbf{e}}_i^{(R)} \cdot \hat{\mathbf{e}}_j^{(R)} = \mathcal{R}\hat{\mathbf{e}}_i \cdot \mathcal{R}\hat{\mathbf{e}}_j = \delta_{ij} \tag{8.23} $$

Using the definition of the identity and transpose operators from Section 7.1, this becomes

$$ \hat{\mathbf{e}}_i \cdot \mathcal{R}^{T}\mathcal{R}\,\hat{\mathbf{e}}_j = \delta_{ij} = \hat{\mathbf{e}}_i \cdot \mathcal{U}\hat{\mathbf{e}}_j \quad\text{or}\quad \left(\mathcal{R}^{T}\mathcal{R}\right)_{ij} = U_{ij} \tag{8.24} $$

Since all of the matrix elements of the operators $(\mathcal{R}^{T}\mathcal{R})$ and $\mathcal{U}$ are equal, the operators are equal and^42

$$ \mathcal{R}^{T}\mathcal{R} = \mathcal{U} \tag{8.25} $$

Taking the determinants of both sides of eqn (8.25) and using eqn (7.35), shows that

$$ (\det\mathcal{R})^2 = (\det\mathcal{R}^{T})(\det\mathcal{R}) = \det\mathcal{U} = 1 \tag{8.26} $$

with the result that

$$ \det\mathcal{R} = \pm 1 \neq 0 \tag{8.27} $$

42 As proved for matrices in Theorem B.22.1, eqn (8.25) is actually a necessary and sufficient condition for $\mathcal{R}$ to be an orthogonal operator. That proof is repeated here.

Thus $\mathcal{R}$ is nonsingular, and $\mathcal{R}^{-1}$ exists. Using $\mathcal{U} = \mathcal{R}\mathcal{R}^{-1}$ and eqn (8.25) gives

$$ \mathcal{R}^{T} = \mathcal{R}^{T}\mathcal{U} = \mathcal{R}^{T}\mathcal{R}\mathcal{R}^{-1} = \mathcal{U}\mathcal{R}^{-1} = \mathcal{R}^{-1} \tag{8.28} $$

which is the condition in Definition 3 and implies eqn (8.22).

Introducing the operator $\mathcal{R}$, the condition in Definition 1 can be written using the definition of the transpose operator in eqn (7.13),

$$ \mathbf{V} \cdot \mathbf{W} = \mathbf{V}^{(R)} \cdot \mathbf{W}^{(R)} = (\mathcal{R}\mathbf{V}) \cdot (\mathcal{R}\mathbf{W}) = \mathbf{V} \cdot \left(\mathcal{R}^{T}\mathcal{R}\,\mathbf{W}\right) \tag{8.29} $$

Thus the orthogonality condition $\mathcal{R}^{T}\mathcal{R} = \mathcal{U}$ from Definition 3 implies the condition of Definition 1, completing the circle of inference. □



8.4 Rotation Matrices 

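From the general discussion of linear operators in Section 7.2, we know that $\mathbf{V}^{(R)} = \mathcal{R}\mathbf{V}$ implies and is implied by the equation

$$ V_i^{(R)} = \sum_{j=1}^{3} R_{ij} V_j \quad\text{where}\quad R_{ij} = \hat{\mathbf{e}}_i \cdot \mathcal{R}\hat{\mathbf{e}}_j \tag{8.30} $$

are the matrix elements of the matrix $\mathsf{R}$ associated with the operator $\mathcal{R}$.

Equation (8.30) gives the components $V_i^{(R)}$ of $\mathbf{V}^{(R)}$ in the $\hat{\mathbf{e}}_i$ basis, in terms of the components $V_j$ of the original vector $\mathbf{V}$ in that same basis. It may also be written in matrix form, as

$$ [V^{(R)}] = \mathsf{R}\,[V] \tag{8.31} $$

where $[V]$ is the column vector of components $V_j$ and $[V^{(R)}]$ is the column vector of components $V_i^{(R)}$.

As an example of a rotation operator, consider the rotation denoted $\mathcal{R}[\theta\hat{\mathbf{e}}_3]$, a rotation by angle $\theta$ about the $\hat{\mathbf{e}}_3$ axis. The second of eqn (8.30) shows that the matrix element $R_{ij}$ is the dot product of $\hat{\mathbf{e}}_i$ with the rotated image of $\hat{\mathbf{e}}_j$. Thus $R_{ij} = \hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_j^{(R)}$. Evaluating these dot products gives the matrix

$$ \mathsf{R}[\theta\hat{\mathbf{e}}_3] = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{8.32} $$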

The reader should verify all of the matrix elements of eqn (8.32), and also check that 
this matrix, and hence the associated operator, are orthogonal and have determinant 
equal to plus one. A general prescription for deriving the operators and matrices for 
rotation about any axis will be given in Section 8.18. 
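
The verification suggested above can also be carried out numerically. The following Python sketch builds the matrix of eqn (8.32) for an assumed sample angle and checks the orthogonality condition of eqn (8.25) and the determinant. The helper names are illustrative assumptions, and only the standard library is used.

```python
import math

def rotation_about_e3(theta):
    """Matrix of eqn (8.32): rotation by angle theta about the e3 axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

R = rotation_about_e3(0.7)          # 0.7 rad is an arbitrary sample angle
RtR = matmul(transpose(R), R)       # should be the unit matrix, eqn (8.25)
detR = det3(R)                      # should be +1

# Each column of R is the rotated image of a basis vector:
# a quarter turn about e3 carries e1 into e2 (up to rounding).
e1_rotated = matvec(rotation_about_e3(math.pi / 2), [1.0, 0.0, 0.0])
```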
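8.5 Some Properties of Rotation Operators

By Definition 2 of Section 8.3, the rotated images of the basis vectors $\hat{\mathbf{e}}_i$ are also three mutually orthogonal unit vectors and hence form a basis in the space. Like any vectors, these rotated basis vectors may be expanded in the original basis, as

$$ \hat{\mathbf{e}}_i^{(R)} = \sum_{a=1}^{3} \hat{\mathbf{e}}_a \left(\hat{\mathbf{e}}_a \cdot \hat{\mathbf{e}}_i^{(R)}\right) = \sum_{a=1}^{3} \hat{\mathbf{e}}_a \left(\hat{\mathbf{e}}_a \cdot \mathcal{R}\hat{\mathbf{e}}_i\right) = \sum_{a=1}^{3} \hat{\mathbf{e}}_a R_{ai} = \sum_{a=1}^{3} R^{T}_{ia}\,\hat{\mathbf{e}}_a \tag{8.33} $$

Note that the basis vectors transform using the transposed matrix $\mathsf{R}^{T}$.

It is useful to define rotated versions of the Kronecker delta function and the Levi-Civita function defined in Section A.5. It follows from Definition 1 of Section 8.3 that the rotated Kronecker delta function is the same as the original one,

$$ \delta_{ij}^{(R)} = \hat{\mathbf{e}}_i^{(R)} \cdot \hat{\mathbf{e}}_j^{(R)} = \hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_j = \delta_{ij} \tag{8.34} $$

Using eqn (8.33), the rotated Levi-Civita function may be expanded as

$$ \varepsilon_{ijk}^{(R)} = \left(\hat{\mathbf{e}}_i^{(R)} \times \hat{\mathbf{e}}_j^{(R)}\right) \cdot \hat{\mathbf{e}}_k^{(R)} = \sum_{a=1}^{3}\sum_{b=1}^{3}\sum_{c=1}^{3} R_{ai} R_{bj} R_{ck} \left(\hat{\mathbf{e}}_a \times \hat{\mathbf{e}}_b\right) \cdot \hat{\mathbf{e}}_c = \sum_{a=1}^{3}\sum_{b=1}^{3}\sum_{c=1}^{3} R_{ai} R_{bj} R_{ck}\, \varepsilon_{abc} \tag{8.35} $$

It follows from eqn (B.37) and the properties of $\varepsilon_{abc}$ listed in Section A.5 that

$$ \varepsilon_{123}^{(R)} = \sum_{a=1}^{3}\sum_{b=1}^{3}\sum_{c=1}^{3} R_{a1} R_{b2} R_{c3}\, \varepsilon_{abc} = |\mathsf{R}| = \det\mathcal{R} \tag{8.36} $$

Since exchange of two indices of $\varepsilon_{ijk}^{(R)}$ implies the exchange of two corresponding indices of $\varepsilon_{abc}$ in eqn (8.35), one obtains

$$ \varepsilon_{ijk}^{(R)} = \left(\hat{\mathbf{e}}_i^{(R)} \times \hat{\mathbf{e}}_j^{(R)}\right) \cdot \hat{\mathbf{e}}_k^{(R)} = (\det\mathcal{R})\, \varepsilon_{ijk} \tag{8.37} $$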
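
The relation between the triple sum of eqn (8.35) and the determinant in eqn (8.37) can be checked by brute force over all 27 index combinations. The following Python sketch does so for one proper and one improper sample matrix. The function names, the sample angle, and the use of indices 0, 1, 2 in place of 1, 2, 3 are assumptions made for the illustration.

```python
import math

def levi_civita(i, j, k):
    """epsilon_ijk for indices in {0, 1, 2}; zero if any index repeats."""
    return (i - j) * (j - k) * (k - i) // 2

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def rotated_levi_civita(R, i, j, k):
    """Triple sum on the right of eqn (8.35)."""
    return sum(R[a][i] * R[b][j] * R[c][k] * levi_civita(a, b, c)
               for a in range(3) for b in range(3) for c in range(3))

def max_deviation(R):
    """Largest violation of eqn (8.37) over all index combinations."""
    d = det3(R)
    return max(abs(rotated_levi_civita(R, i, j, k) - d * levi_civita(i, j, k))
               for i in range(3) for j in range(3) for k in range(3))

c, s = math.cos(0.6), math.sin(0.6)
proper = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]   # rotation about e3
improper = [[-x for x in row] for row in proper]        # same rotation times total inversion
```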



8.6 Proper and Improper Rotation Operators 

Equation (8.27) states that the determinant of a rotation operator must be either $+1$ or $-1$. Rotation operators with $\det\mathcal{R} = +1$ are called proper rotation operators, or proper rotations. Those with $\det\mathcal{R} = -1$ are called improper rotation operators. These operators are also referred to as proper or improper orthogonal operators.

For example, consider the identity operator $\mathcal{U}$. It is orthogonal, since $\mathcal{U}^{T} = \mathcal{U}$ and therefore $\mathcal{U}\mathcal{U}^{T} = \mathcal{U}^2 = \mathcal{U}$. The identity can be thought of as a degenerate proper rotation (by zero angle), since $\det\mathcal{U} = +1$.

The total inversion operator $\mathcal{T} = -\mathcal{U}$, which converts every vector $\mathbf{V}$ into $-\mathbf{V}$, is also orthogonal since $\mathcal{T}\mathcal{T}^{T} = \mathcal{U}^2 = \mathcal{U}$. But $\det\mathcal{T} = -1$ and so the total inversion operator is an improper rotation.

The distinction between proper and improper rotations is of no importance for dot products, since, by Definition 1 of Section 8.3,

$$ \mathbf{V}^{(R)} \cdot \mathbf{W}^{(R)} = \mathbf{V} \cdot \mathbf{W} \tag{8.38} $$

in either case.

But cross products are sensitive to the distinction, as proved in the following theorem.

Theorem 8.6.1: Rotated Cross Products 

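With the definitions $\mathbf{A}^{(R)} = \mathcal{R}\mathbf{A}$, $\mathbf{B}^{(R)} = \mathcal{R}\mathbf{B}$, and $\mathbf{C}^{(R)} = \mathcal{R}\mathbf{C}$,

$$ \mathbf{A} = \mathbf{B} \times \mathbf{C} \quad\text{implies}\quad \mathbf{A}^{(R)} = (\det\mathcal{R}) \left(\mathbf{B}^{(R)} \times \mathbf{C}^{(R)}\right) \tag{8.39} $$

Proof: Writing $\mathbf{A} = \sum_{k=1}^{3} A_k \hat{\mathbf{e}}_k$, with a similar expansion for $\mathbf{B}$ and $\mathbf{C}$, it follows from the linearity of operator $\mathcal{R}$ that

$$ \mathbf{A}^{(R)} = \mathcal{R}\mathbf{A} = \mathcal{R} \sum_{k=1}^{3} A_k \hat{\mathbf{e}}_k = \sum_{k=1}^{3} A_k \mathcal{R}\hat{\mathbf{e}}_k = \sum_{k=1}^{3} A_k \hat{\mathbf{e}}_k^{(R)} \tag{8.40} $$

with similar expressions for $\mathbf{B}^{(R)}$ and $\mathbf{C}^{(R)}$. Thus

$$ \mathbf{B}^{(R)} \times \mathbf{C}^{(R)} = \sum_{i=1}^{3}\sum_{j=1}^{3} B_i C_j \left(\hat{\mathbf{e}}_i^{(R)} \times \hat{\mathbf{e}}_j^{(R)}\right) = \sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{3} B_i C_j \left(\hat{\mathbf{e}}_i^{(R)} \times \hat{\mathbf{e}}_j^{(R)} \cdot \hat{\mathbf{e}}_k^{(R)}\right) \hat{\mathbf{e}}_k^{(R)} \tag{8.41} $$

where the last expression expands the vector $\hat{\mathbf{e}}_i^{(R)} \times \hat{\mathbf{e}}_j^{(R)}$ in the rotated basis. Using eqns (8.37, 8.40), and the expansion of cross products from eqn (A.16), then gives

$$ \mathbf{B}^{(R)} \times \mathbf{C}^{(R)} = (\det\mathcal{R}) \sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{3} \varepsilon_{ijk} B_i C_j\, \hat{\mathbf{e}}_k^{(R)} = (\det\mathcal{R}) \sum_{k=1}^{3} A_k \hat{\mathbf{e}}_k^{(R)} = (\det\mathcal{R})\, \mathbf{A}^{(R)} \tag{8.42} $$

Since $(\det\mathcal{R})^2 = 1$ by eqn (8.26), multiplying both sides of eqn (8.42) by $\det\mathcal{R}$ gives eqn (8.39), as was to be proved.

□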
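
Theorem 8.6.1 can be illustrated with one proper and one improper operator. The following Python sketch evaluates both sides of eqn (8.39) for a rotation about $\hat{\mathbf{e}}_3$ and for the total inversion operator of Section 8.6. The sample vectors, the sample angle, and the helper names are assumptions chosen for the illustration.

```python
import math

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

c, s = math.cos(0.4), math.sin(0.4)
proper = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]                # det = +1
inversion = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]   # det = -1

B, C = [1.0, 2.0, 3.0], [-1.0, 0.5, 2.0]   # arbitrary sample vectors
A = cross(B, C)

def mismatch(R, det_R):
    """Largest component of A^(R) - det(R) * (B^(R) x C^(R)), eqn (8.39)."""
    A_rot = matvec(R, A)
    BxC_rot = cross(matvec(R, B), matvec(R, C))
    return max(abs(a - det_R * b) for a, b in zip(A_rot, BxC_rot))

proper_ok = mismatch(proper, +1.0)           # close to zero
inversion_ok = mismatch(inversion, -1.0)     # close to zero
inversion_wrong = mismatch(inversion, +1.0)  # large: the sign factor matters
```

For the inversion, both $\mathbf{B}$ and $\mathbf{C}$ change sign, so their cross product does not, while $\mathbf{A}^{(R)} = -\mathbf{A}$. The factor $\det\mathcal{R} = -1$ in eqn (8.39) accounts for exactly this sign.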



We will be concerned almost entirely with proper rotations, for which $\det\mathcal{R} = +1$. Using eqn (8.39), we may now give a necessary and sufficient condition for $\mathcal{R}$ to be a proper rotation operator.

Definition 8.6.2: Proper Rotation Operator

The linear operator $\mathcal{R}$ is a proper rotation operator if and only if it satisfies Definition 2 of Section 8.3 as well as the condition that the original and the rotated basis vectors $\hat{\mathbf{e}}_i^{(R)} = \mathcal{R}\hat{\mathbf{e}}_i$ for $i = 1, 2, 3$ both form right-handed systems, obeying

$$ \hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3 \quad\text{and}\quad \hat{\mathbf{e}}_1^{(R)} \times \hat{\mathbf{e}}_2^{(R)} = \hat{\mathbf{e}}_3^{(R)} \tag{8.43} $$
8.7 The Rotation Group

As for any linear operator, the product of two rotation operators is defined to mean successive application. Thus, for any vector $\mathbf{V}$, the product $\mathcal{R} = \mathcal{R}_1\mathcal{R}_2$ implies that

$$ \mathcal{R}\mathbf{V} = (\mathcal{R}_1\mathcal{R}_2)\,\mathbf{V} = \mathcal{R}_1\,(\mathcal{R}_2\mathbf{V}) \tag{8.44} $$

in which the right operator $\mathcal{R}_2$ is applied first to $\mathbf{V}$ and the left operator $\mathcal{R}_1$ is then applied to the result.

A set of objects is said to form a group if a binary operation called group multiplication of the objects is defined and if a set of group axioms is satisfied. The common usage is to say that the objects form a group under that particular group multiplication. We show that proper rotations form a group under the operator multiplication defined in eqn (8.44).

1. The first axiom is closure. The group product of two objects must be an object in the same group. Thus, the product of two proper rotations must also be a proper rotation. If $\mathcal{R} = \mathcal{R}_1\mathcal{R}_2$ and rotations $\mathcal{R}_1$ and $\mathcal{R}_2$ both satisfy Definition 3 of Section 8.3, then

$$ \mathcal{R}^{-1} = (\mathcal{R}_1\mathcal{R}_2)^{-1} = \mathcal{R}_2^{-1}\mathcal{R}_1^{-1} = \mathcal{R}_2^{T}\mathcal{R}_1^{T} = (\mathcal{R}_1\mathcal{R}_2)^{T} = \mathcal{R}^{T} \tag{8.45} $$

shows that $\mathcal{R}$ also satisfies the same definition and hence is also a rotation. Moreover, if $\mathcal{R}_1$ and $\mathcal{R}_2$ are proper rotations, then

$$ \det\mathcal{R} = \det(\mathcal{R}_1\mathcal{R}_2) = \det\mathcal{R}_1 \det\mathcal{R}_2 = (+1)(+1) = +1 \tag{8.46} $$

shows that $\mathcal{R}$ is also a proper rotation. Thus closure is proved.

2. There must be an identity in the group such that pre- or post-multiplication of any object by that identity does not change the object. We have previously seen that the identity, or unity, operator $\mathcal{U}$ is a proper rotation operator.

3. Every object in the group must have an inverse in the group, such that pre- or post-multiplication of that object by its inverse yields the identity object. As noted in the proof of Definition 3 of Section 8.3, the inverse $\mathcal{R}^{-1}$ of a proper rotation always exists. To see that the inverse is also a proper rotation, set $\mathcal{B} = \mathcal{R}^{-1} = \mathcal{R}^{T}$, where $\mathcal{R}$ is a proper rotation operator. Then

$$ \mathcal{B}^{-1} = \left(\mathcal{R}^{-1}\right)^{-1} = \mathcal{R} = \left(\mathcal{R}^{T}\right)^{T} = \mathcal{B}^{T} \tag{8.47} $$

which shows that $\mathcal{B}$ is a rotation operator. Also $1 = \det\mathcal{R} = \det\mathcal{B}^{T} = \det\mathcal{B}$ shows that $\mathcal{B}$ is a proper rotation.

4. Group multiplication must be associative. Proper rotation operators obey $(\mathcal{R}_1\mathcal{R}_2)\,\mathcal{R}_3 = \mathcal{R}_1\,(\mathcal{R}_2\mathcal{R}_3)$ since, as discussed in Section 7.1, both sides are equal to $\mathcal{R}_1\mathcal{R}_2\mathcal{R}_3$.
The group of proper rotations is designated SO(3), which stands for the special (determinant equal to +1), orthogonal group in three dimensions.

If the product of a pair of elements gives a result independent of their order, the group is said to be Abelian. The rotation operators form a non-Abelian group. A finite rotation $\mathcal{R}_1\mathcal{R}_2$ will not usually give the same end result as a finite rotation $\mathcal{R}_2\mathcal{R}_1$.

For example, place a closed book on the table in front of you, as if preparing to open and read it. Rotate it by 90° about a vertical axis, and then by 90° about an axis running from your left to your right hands. Now replace the book in its original position and do the same two rotations in reverse order. You will see that the final orientation of the book is indeed different.

We say that finite rotations do not commute. Writing the commutator of $\mathcal{R}_1$ and $\mathcal{R}_2$ as

$$ [\mathcal{R}_1, \mathcal{R}_2]_c = \mathcal{R}_1\mathcal{R}_2 - \mathcal{R}_2\mathcal{R}_1 \tag{8.48} $$

we express this result by saying that proper rotations have in general a nonzero commutator and so form a non-Abelian group.
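
The book experiment can be mimicked with two 90° rotation matrices. In the following Python sketch, the identification of the vertical axis with $\hat{\mathbf{e}}_3$ and of the left-to-right axis with $\hat{\mathbf{e}}_1$ is an assumption made for the illustration, as are the variable names. Integer matrix entries are used so that the results are exact.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# 90 degree rotations, written with exact integer entries
R_vertical = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]    # about e3 (assumed vertical axis)
R_leftright = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]   # about e1 (assumed left-to-right axis)

# In a product, the right factor acts first, as in eqn (8.44)
vertical_first = matmul(R_leftright, R_vertical)
leftright_first = matmul(R_vertical, R_leftright)

# Commutator of eqn (8.48) with R_1 = R_leftright and R_2 = R_vertical
commutator = [[vertical_first[i][j] - leftright_first[i][j] for j in range(3)]
              for i in range(3)]
```

The two products differ, so the commutator is not the zero matrix, in agreement with the statement that finite rotations do not commute.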



8.8 Kinematics of a Rigid Body 

Let a rigid body have a center of mass $\mathbf{R}$ and relative position vectors $\boldsymbol{\rho}_n$. As the rigid body moves, both $\mathbf{R}$ and the $\boldsymbol{\rho}_n$ will be functions of time. At $t = 0$, the position of the mass $m_n$ relative to the origin of some inertial coordinate system will be

$$ \mathbf{r}_n(0) = \mathbf{R}(0) + \boldsymbol{\rho}_n(0) \tag{8.49} $$

and at time $t$ the location will be

$$ \mathbf{r}_n(t) = \mathbf{R}(t) + \boldsymbol{\rho}_n(t) \tag{8.50} $$

As proved in Lemma 8.2.1, the dot product of any pair of relative position vectors is constant (including that of a vector with itself, giving its magnitude squared). Hence, these dot products also will be the same at all times $t$,

$$ \boldsymbol{\rho}_l(t) \cdot \boldsymbol{\rho}_n(t) = \boldsymbol{\rho}_l(0) \cdot \boldsymbol{\rho}_n(0) \tag{8.51} $$

Thus the problem of parameterizing the orientation of a rigid body (by which we mean defining the location of all of its masses $m_n$ once its center of mass is known) boils down to finding an expression for the evolution of vectors $\boldsymbol{\rho}_n(t)$ that obey eqn (8.51) at all times $t$.

The first step toward such a parameterization is to construct a system of coordinates tied to the rigid body. With the rigid body at its initial position and orientation at time zero, it is always possible to select three non-coplanar relative position vectors. For simplicity, suppose that these are the first three of them $\boldsymbol{\rho}_1(0)$, $\boldsymbol{\rho}_2(0)$, $\boldsymbol{\rho}_3(0)$. Now apply the Schmidt orthogonalization method^43 to these vectors to construct a right-handed, orthonormal set of unit vectors $\hat{\mathbf{e}}'_1(0)$, $\hat{\mathbf{e}}'_2(0)$, $\hat{\mathbf{e}}'_3(0)$. Thus, by construction,

$$ \hat{\mathbf{e}}'_i(0) = \sum_{k=1}^{3} \alpha_{ik}\, \boldsymbol{\rho}_k(0) \quad\text{and}\quad \delta_{ij} = \hat{\mathbf{e}}'_i(0) \cdot \hat{\mathbf{e}}'_j(0) = \sum_{k=1}^{3}\sum_{l=1}^{3} \alpha_{ik}\alpha_{jl}\, \boldsymbol{\rho}_k(0) \cdot \boldsymbol{\rho}_l(0) \tag{8.52} $$

where the $\alpha_{ik}$ are coefficients specified by the Schmidt method.

43 See Section B.20.

Fig. 8.4. The body system unit vectors $\hat{\mathbf{e}}'_i$ are rigidly fixed in the moving body. The relative position vector $\boldsymbol{\rho}_n$ is also fixed in the body. Hence its components in the body system are constants.

Now define vectors $\hat{\mathbf{e}}'_i(t)$ at time $t$ by

$$ \hat{\mathbf{e}}'_i(t) = \sum_{k=1}^{3} \alpha_{ik}\, \boldsymbol{\rho}_k(t) \tag{8.53} $$

where the $\alpha_{ik}$ factors in eqn (8.53) are defined to be the same as those in eqn (8.52). It follows from eqns (8.51, 8.52) that

$$ \hat{\mathbf{e}}'_i(t) \cdot \hat{\mathbf{e}}'_j(t) = \sum_{k=1}^{3}\sum_{l=1}^{3} \alpha_{ik}\alpha_{jl}\, \boldsymbol{\rho}_k(t) \cdot \boldsymbol{\rho}_l(t) = \sum_{k=1}^{3}\sum_{l=1}^{3} \alpha_{ik}\alpha_{jl}\, \boldsymbol{\rho}_k(0) \cdot \boldsymbol{\rho}_l(0) = \hat{\mathbf{e}}'_i(0) \cdot \hat{\mathbf{e}}'_j(0) = \delta_{ij} \tag{8.54} $$

which shows that the $\hat{\mathbf{e}}'_i(t)$ are also an orthonormal set of unit vectors for all $t$.

The coordinate system consisting of these three orthonormal vectors $\hat{\mathbf{e}}'_i(t)$, with its origin at the center of mass, will be called the body system. The relative position vectors can be expanded in this body system as

$$ \boldsymbol{\rho}_n(t) = \sum_{i=1}^{3} \rho'_{ni}(t)\, \hat{\mathbf{e}}'_i(t) \tag{8.55} $$

Equation (8.51) implies that the components $\rho'_{ni}(t)$ will be constants, always equal to their values $\rho'_{ni}(0)$ at time zero,

$$ \rho'_{ni}(t) = \hat{\mathbf{e}}'_i(t) \cdot \boldsymbol{\rho}_n(t) = \sum_{k=1}^{3} \alpha_{ik}\, \boldsymbol{\rho}_k(t) \cdot \boldsymbol{\rho}_n(t) = \sum_{k=1}^{3} \alpha_{ik}\, \boldsymbol{\rho}_k(0) \cdot \boldsymbol{\rho}_n(0) = \hat{\mathbf{e}}'_i(0) \cdot \boldsymbol{\rho}_n(0) = \rho'_{ni}(0) \tag{8.56} $$

Thus the angles between the various $\hat{\mathbf{e}}'_i(t)$ and $\boldsymbol{\rho}_n(t)$ will never change. The $\hat{\mathbf{e}}'_i(t)$ are rigidly connected to the body and turn with it as it moves.
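
The construction of the body system can be sketched numerically. In the following Python example, the Schmidt procedure is applied to the first three of a set of sample vectors, before and after the whole set is turned by a sample rotation about $\hat{\mathbf{e}}_3$. The sample vectors are hypothetical stand-ins for relative position vectors (they are not required to satisfy the center-of-mass condition), and the function names are assumptions. Because the Schmidt coefficients depend only on dot products and on the handedness of the triad, repeating the procedure at the later time is assumed to reproduce the same $\alpha_{ik}$ as in eqn (8.53).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def scale(u, c):
    return [c * a for a in u]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def unit(u):
    return scale(u, 1.0 / math.sqrt(dot(u, u)))

def body_axes(p1, p2, p3):
    """Right-handed orthonormal triad from three non-coplanar vectors (Schmidt method)."""
    e1 = unit(p1)
    e2 = unit(sub(p2, scale(e1, dot(e1, p2))))
    e3 = unit(sub(sub(p3, scale(e1, dot(e1, p3))), scale(e2, dot(e2, p3))))
    if dot(cross(e1, e2), e3) < 0.0:   # enforce right-handedness
        e3 = scale(e3, -1.0)
    return [e1, e2, e3]

# Hypothetical vectors at time zero, and the same vectors after a sample rotation
rho_0 = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0], [-0.5, 2.0, 0.3]]
c, s = math.cos(1.1), math.sin(1.1)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
rho_t = [matvec(R, p) for p in rho_0]

axes_0 = body_axes(*rho_0[:3])
axes_t = body_axes(*rho_t[:3])

# Body-system components, eqn (8.56): the same at both times
components_0 = [[dot(e, p) for e in axes_0] for p in rho_0]
components_t = [[dot(e, p) for e in axes_t] for p in rho_t]
```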

8.9 Rotation Operators and Rigid Bodies 

The time evolution of a rigid body can be systematized by defining a time-dependent rotation operator $\mathcal{R}(t)$ by the condition that it maps each $\hat{\mathbf{e}}'_i(0)$ of the body system at time zero into its value $\hat{\mathbf{e}}'_i(t)$ at time $t$, as proved in the following theorem.

Theorem 8.9.1 Define a time-dependent operator $\mathcal{R}(t)$ by the condition that, for $i = 1, 2, 3$,

$$ \hat{\mathbf{e}}'_i(t) = \mathcal{R}(t)\, \hat{\mathbf{e}}'_i(0) \tag{8.57} $$

It follows that $\mathcal{R}(t)$ is a proper rotation operator obeying

$$ \mathcal{R}(t)^{T}\mathcal{R}(t) = \mathcal{U} = \mathcal{R}(t)\mathcal{R}(t)^{T} \tag{8.58} $$

and $\det\mathcal{R}(t) = +1$ for all time $t$.

It also follows that

$$ \boldsymbol{\rho}_n(t) = \mathcal{R}(t)\, \boldsymbol{\rho}_n(0) \tag{8.59} $$

and that

$$ \boldsymbol{\rho}_l(t) \cdot \boldsymbol{\rho}_n(t) = \boldsymbol{\rho}_l(0) \cdot \boldsymbol{\rho}_n(0) \tag{8.60} $$

as is required for rigid bodies.

Proof: Identify $\hat{\mathbf{e}}'_i(t)$ with the rotated basis vector $\hat{\mathbf{e}}_i^{(R)}$ in Definition 2 of Section 8.3. Since eqn (8.54) proved the $\hat{\mathbf{e}}'_i(t)$ to be an orthonormal system of basis vectors, it follows from Definition 2 that $\mathcal{R}(t)$ is a rotation operator. Hence, by equivalent Definition 3, it obeys eqn (8.58) at all times $t$.

A general relative position vector $\boldsymbol{\rho}_n(t)$ can be expanded in the body system $\hat{\mathbf{e}}'_i(t)$ basis as given in eqn (8.55),

$$ \boldsymbol{\rho}_n(t) = \sum_{i=1}^{3} \rho'_{ni}(t)\, \hat{\mathbf{e}}'_i(t) \tag{8.61} $$

The components in this expansion were shown in eqn (8.56) to be constants, with $\rho'_{ni}(t) = \rho'_{ni}(0)$. Thus, using the linearity of $\mathcal{R}(t)$,

$$ \boldsymbol{\rho}_n(t) = \sum_{i=1}^{3} \rho'_{ni}(0)\, \hat{\mathbf{e}}'_i(t) = \sum_{i=1}^{3} \rho'_{ni}(0)\, \mathcal{R}(t)\, \hat{\mathbf{e}}'_i(0) = \mathcal{R}(t) \left( \sum_{i=1}^{3} \rho'_{ni}(0)\, \hat{\mathbf{e}}'_i(0) \right) = \mathcal{R}(t)\, \boldsymbol{\rho}_n(0) \tag{8.62} $$

as was to be proved.

It then follows from the orthogonality of $\mathcal{R}(t)$ and the definition of transpose in eqn (7.13) that

$$ \boldsymbol{\rho}_l(t) \cdot \boldsymbol{\rho}_n(t) = \mathcal{R}(t)\boldsymbol{\rho}_l(0) \cdot \mathcal{R}(t)\boldsymbol{\rho}_n(0) = \boldsymbol{\rho}_l(0) \cdot \mathcal{R}(t)^{T}\mathcal{R}(t)\, \boldsymbol{\rho}_n(0) = \boldsymbol{\rho}_l(0) \cdot \boldsymbol{\rho}_n(0) \tag{8.63} $$

which is eqn (8.60).

It follows from eqn (8.57) that at time zero, $\mathcal{R}(0) = \mathcal{U}$, which has determinant $+1$. Since the vectors $\boldsymbol{\rho}_n(t)$ of the rigid body, and hence the body system unit vectors $\hat{\mathbf{e}}'_i(t)$, are assumed to evolve continuously with time, the determinant cannot make a discontinuous jump to the only other possible value $-1$. Thus $\det\mathcal{R}(t) = +1$, $\mathcal{R}(t)$ is a proper rotation, and the body system unit vectors $\hat{\mathbf{e}}'_i(t)$ remain a right-handed, orthonormal triad for all time $t$. □



8.10 Differentiation of a Rotation Operator 

We now have an operator $\mathcal{R}(t)$ that allows any vector of a rigid body at time $t$ to be expressed in terms of that vector at time zero. But Lagrangian mechanics also needs expressions for the velocities of the point masses of the rigid body. To obtain these velocities, we now derive the time derivatives of the operator $\mathcal{R}(t)$ and of the vectors rotated by it.

Suppose that $\mathcal{R}(t)$ acts on an arbitrary constant vector $\mathbf{V}$ to produce a time-varying rotated vector $\mathbf{V}^{(R)}(t)$ as in

$$ \mathbf{V}^{(R)}(t) = \mathcal{R}(t)\, \mathbf{V} \tag{8.64} $$

Taking the derivative of eqn (8.64) gives

$$ \frac{d\mathbf{V}^{(R)}(t)}{dt} = \frac{d\mathcal{R}(t)}{dt}\, \mathbf{V} \tag{8.65} $$

since $\mathbf{V}$ is a constant. The meaning of this last equation is perhaps made clearer if we express eqns (8.64, 8.65) in component form as

$$ V_i^{(R)}(t) = \sum_{j=1}^{3} R_{ij}(t)\, V_j \tag{8.66} $$

and

$$ \frac{dV_i^{(R)}(t)}{dt} = \sum_{j=1}^{3} \frac{dR_{ij}(t)}{dt}\, V_j \tag{8.67} $$

Comparison of eqns (8.65, 8.67) shows that $d\mathcal{R}(t)/dt$ is that operator such that each of its matrix elements is the time derivative of the corresponding matrix element $R_{ij}(t)$. Written out, the matrix of $d\mathcal{R}(t)/dt$ is

$$ \frac{d\mathsf{R}(t)}{dt} = \begin{pmatrix} \dfrac{dR_{11}}{dt} & \dfrac{dR_{12}}{dt} & \dfrac{dR_{13}}{dt} \\ \dfrac{dR_{21}}{dt} & \dfrac{dR_{22}}{dt} & \dfrac{dR_{23}}{dt} \\ \dfrac{dR_{31}}{dt} & \dfrac{dR_{32}}{dt} & \dfrac{dR_{33}}{dt} \end{pmatrix} \tag{8.68} $$

in which each element is differentiated.

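Continuing, we use eqn (8.58) to write eqn (8.65) in the form

$$ \frac{d\mathbf{V}^{(R)}(t)}{dt} = \frac{d\mathcal{R}(t)}{dt}\, \mathbf{V} = \frac{d\mathcal{R}(t)}{dt}\, \mathcal{U}\mathbf{V} = \frac{d\mathcal{R}(t)}{dt}\, \mathcal{R}(t)^{T}\mathcal{R}(t)\, \mathbf{V} = \frac{d\mathcal{R}(t)}{dt}\, \mathcal{R}(t)^{T}\, \mathbf{V}^{(R)}(t) \tag{8.69} $$

where eqn (8.64) has been used to get the last equality. Defining the operator $\mathcal{W}(t)$ by

$$ \mathcal{W}(t) = \frac{d\mathcal{R}(t)}{dt}\, \mathcal{R}(t)^{T} \tag{8.70} $$

then gives

$$ \frac{d\mathbf{V}^{(R)}(t)}{dt} = \mathcal{W}(t)\, \mathbf{V}^{(R)}(t) \tag{8.71} $$

which expresses the time derivative in terms of the current value of $\mathbf{V}^{(R)}(t)$ at time $t$.

The real, time-varying operator $\mathcal{W}(t)$ defined by eqn (8.70) is anti-symmetric. From eqn (8.58) we have $\mathcal{U} = \mathcal{R}(t)\mathcal{R}(t)^{T}$. Differentiating both sides of this equation with respect to $t$ using the product rule eqn (7.70) gives

$$ 0 = \frac{d\mathcal{U}}{dt} = \frac{d\mathcal{R}(t)}{dt}\, \mathcal{R}(t)^{T} + \mathcal{R}(t)\, \frac{d\mathcal{R}(t)^{T}}{dt} = \frac{d\mathcal{R}(t)}{dt}\, \mathcal{R}(t)^{T} + \mathcal{R}(t) \left( \frac{d\mathcal{R}(t)}{dt} \right)^{T} = \mathcal{W}(t) + \mathcal{W}(t)^{T} \tag{8.72} $$

which implies the anti-symmetry

$$ \mathcal{W}(t)^{T} = -\mathcal{W}(t) \tag{8.73} $$

In deriving eqn (8.72) we used the fact that taking the transpose of an operator and then differentiating it with respect to time produces the same result as doing the same two operations in reverse order. This operator identity follows from the same identity for matrices

$$ \frac{d\,\mathsf{R}(t)^{T}}{dt} = \left( \frac{d\,\mathsf{R}(t)}{dt} \right)^{T} \tag{8.74} $$

which can be obtained by inspection of eqn (8.68).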
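In Section 7.5, we determined that the most general real, anti-symmetric operator acting on a vector is equivalent to a vector $\boldsymbol{\omega}$ acting by means of a cross product. Thus there is some vector $\boldsymbol{\omega}(t)$ such that

$$ \mathcal{W}(t)\,\mathbf{A} = \boldsymbol{\omega}(t) \times \mathbf{A} \tag{8.75} $$

for any arbitrary vector $\mathbf{A}$, where vector $\mathbf{A}$ itself may or may not be time-varying. Hence the time derivative in eqn (8.71) can be written

$$ \frac{d\mathbf{V}^{(R)}(t)}{dt} = \mathcal{W}(t)\, \mathbf{V}^{(R)}(t) = \boldsymbol{\omega}(t) \times \mathbf{V}^{(R)}(t) \tag{8.76} $$

where $\boldsymbol{\omega}(t)$ is in general time varying since $\mathcal{W}(t)$ is.

The vector $\boldsymbol{\omega}(t)$ in eqn (8.76) is called the angular velocity vector of the time-varying rotation. Expanding this vector in the fixed, inertial $\hat{\mathbf{e}}_i$ basis,

$$ \boldsymbol{\omega}(t) = \omega_1(t)\, \hat{\mathbf{e}}_1 + \omega_2(t)\, \hat{\mathbf{e}}_2 + \omega_3(t)\, \hat{\mathbf{e}}_3 \tag{8.77} $$

the matrix of operator $\mathcal{W}(t)$ can be obtained from eqns (7.38, 7.39) with the time dependence added. The matrix elements are

$$ W_{ij}(t) = \sum_{k=1}^{3} \varepsilon_{ikj}\, \omega_k(t) \tag{8.78} $$

and, written out, the matrix is

$$ \mathsf{W}(t) = \begin{pmatrix} 0 & -\omega_3(t) & \omega_2(t) \\ \omega_3(t) & 0 & -\omega_1(t) \\ -\omega_2(t) & \omega_1(t) & 0 \end{pmatrix} \tag{8.79} $$

An operator differential equation for $\mathcal{R}(t)$ can also be written. Multiply both sides of eqn (8.70) from the right by $\mathcal{R}(t)$ to get

$$ \mathcal{W}(t)\,\mathcal{R}(t) = \frac{d\mathcal{R}(t)}{dt}\, \mathcal{R}(t)^{T}\mathcal{R}(t) = \frac{d\mathcal{R}(t)}{dt}\, \mathcal{U} = \frac{d\mathcal{R}(t)}{dt} \tag{8.80} $$

and hence the differential equation

$$ \frac{d\mathcal{R}(t)}{dt} = \mathcal{W}(t)\, \mathcal{R}(t) \tag{8.81} $$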
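
As a simple check of eqns (8.70), (8.73), and (8.79), consider the special case of a uniform rotation about $\hat{\mathbf{e}}_3$, for which the matrix of eqn (8.32) has angle $\theta = \omega t$. This special case, the sample values, and the function names in the following Python sketch are assumptions made for the illustration. The derivative of the matrix is written analytically, element by element, as in eqn (8.68).

```python
import math

def rotation(t, omega):
    """Matrix of eqn (8.32) with angle omega * t (uniform rotation about e3)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_rate(t, omega):
    """Element-by-element time derivative of rotation(t, omega), as in eqn (8.68)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return [[-omega * s, -omega * c, 0.0], [omega * c, -omega * s, 0.0], [0.0, 0.0, 0.0]]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def angular_velocity_matrix(t, omega):
    """W = (dR/dt) R^T, eqn (8.70)."""
    return matmul(rotation_rate(t, omega), transpose(rotation(t, omega)))

def angular_velocity_vector(W):
    """Read (omega_1, omega_2, omega_3) off the matrix layout of eqn (8.79)."""
    return [W[2][1], W[0][2], W[1][0]]

W = angular_velocity_matrix(0.37, 2.5)    # sample time and rate
omega_vec = angular_velocity_vector(W)    # expected to lie along e3 with magnitude 2.5
```

The resulting matrix is anti-symmetric and independent of the sample time, as expected for a rotation at constant rate about a fixed axis.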



8.11 Meaning of the Angular Velocity Vector 

First, it is useful to establish some notation for later use. The angular velocity vector $\boldsymbol{\omega}(t)$ has a magnitude $\omega(t)$ and an associated unit vector $\hat{\boldsymbol{\omega}}(t)$ which we will typically denote as $\hat{\mathbf{n}}(t)$ in order to make it easier to distinguish from $\boldsymbol{\omega}(t)$ itself. Thus

$$ \hat{\mathbf{n}}(t) = \hat{\boldsymbol{\omega}}(t) = \frac{\boldsymbol{\omega}(t)}{\omega(t)} \tag{8.82} $$

In component form, this equation is

$$ n_i(t) = \frac{\omega_i(t)}{\omega(t)} \tag{8.83} $$

for $i = 1, 2, 3$, where the unit vector $\hat{\mathbf{n}}(t)$ has the expansion

$$ \hat{\mathbf{n}}(t) = n_1(t)\, \hat{\mathbf{e}}_1 + n_2(t)\, \hat{\mathbf{e}}_2 + n_3(t)\, \hat{\mathbf{e}}_3 \tag{8.84} $$

Hence, the angular velocity may be written as a magnitude times a unit vector direction,

$$ \boldsymbol{\omega}(t) = \omega(t)\, \hat{\mathbf{n}}(t) \quad\text{or in component form}\quad \omega_i(t) = \omega(t)\, n_i(t) \tag{8.85} $$

Dividing eqns (8.78, 8.79) by the magnitude $\omega(t)$ allows one to define a new operator $\mathcal{N}(t) = \mathcal{W}(t)/\omega(t)$ with matrix elements based on the axis unit vector $\hat{\mathbf{n}}(t)$. Thus

$$ \frac{W_{ij}(t)}{\omega(t)} = N_{ij}(t) = \sum_{k=1}^{3} \varepsilon_{ikj}\, n_k(t) \tag{8.86} $$

with matrix

$$ \frac{\mathsf{W}(t)}{\omega(t)} = \mathsf{N}(t) = \begin{pmatrix} 0 & -n_3(t) & n_2(t) \\ n_3(t) & 0 & -n_1(t) \\ -n_2(t) & n_1(t) & 0 \end{pmatrix} \tag{8.87} $$

such that

$$ \mathcal{W}(t) = \omega(t)\, \mathcal{N}(t) \quad\text{and}\quad \mathsf{W}(t) = \omega(t)\, \mathsf{N}(t) \tag{8.88} $$

Also, dividing both sides of eqn (8.75) by the magnitude $\omega(t)$ shows that the action of the operator $\mathcal{N}(t)$ is equivalent to a cross product with the unit vector $\hat{\mathbf{n}}(t)$ as in

$$ \mathcal{N}(t)\,\mathbf{A} = \hat{\mathbf{n}}(t) \times \mathbf{A} \tag{8.89} $$

for an arbitrary vector $\mathbf{A}$.

With this notation established, now consider the angular velocity vector. Multiplying eqn (8.76) by $dt$ gives the differential relation

$$ d\mathbf{V}^{(R)}(t) = \mathcal{W}(t)\, dt\, \mathbf{V}^{(R)}(t) = \boldsymbol{\omega}(t)\, dt \times \mathbf{V}^{(R)}(t) \tag{8.90} $$

which we may rewrite as

$$ d\mathbf{V}^{(R)}(t) = (\omega(t)\, dt)\, \mathcal{N}(t)\, \mathbf{V}^{(R)}(t) = (\omega(t)\, dt)\, \hat{\mathbf{n}}(t) \times \mathbf{V}^{(R)}(t) \tag{8.91} $$

For small enough $dt$, the differential $d\mathbf{V}^{(R)}(t)$ approximates the change in $\mathbf{V}^{(R)}(t)$ during that time interval. From the properties of cross products, this change is a vector perpendicular to both $\hat{\mathbf{n}}(t)$ and $\mathbf{V}^{(R)}(t)$, with magnitude $(\omega(t)\, dt)\, V^{(R)}(t) \sin\theta_{\omega V}$ where $\theta_{\omega V}$ is the angle between the two vectors. Geometrically, this is a rotation of vector $\mathbf{V}^{(R)}(t)$ about an instantaneous axis whose direction is given by the unit vector $\hat{\mathbf{n}}(t)$, with a rotation angle $d\Phi$ defined as

$$ d\Phi = \omega(t)\, dt \quad\text{so that}\quad d\Phi\, \hat{\mathbf{n}}(t) = \boldsymbol{\omega}(t)\, dt \tag{8.92} $$

The angular velocity vector $\boldsymbol{\omega}(t)$ thus has a magnitude $\omega(t)$ that gives the instantaneous rate of rotation $\omega(t) = d\Phi/dt$, and an associated unit vector $\hat{\boldsymbol{\omega}}(t) = \hat{\mathbf{n}}(t)$ that gives the instantaneous axis of rotation.

In general, both of these quantities will change with time. Thus, even if eqn (8.92) could be integrated to obtain some angle $\Phi$, in general that angle would be meaningless since each of the increments $d\Phi$ takes place at a different time and hence about a different axis.

Note that vectors parallel to the instantaneous axis are not changed at all in time 
interval dt since the cross product of the two parallel vectors in eqn (8.90) will vanish. 
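
The sense in which the differential of eqn (8.90) approximates a small rotation can be illustrated numerically. The following Python sketch assumes the special case of a constant angular velocity along $\hat{\mathbf{e}}_3$, for which the exact rotated vector is known from eqn (8.32), and compares it with the first-order step $\mathbf{V} + \boldsymbol{\omega}\, dt \times \mathbf{V}$. The sample vector, the rate, and the function names are assumptions made for the illustration.

```python
import math

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def exact_rotation_about_e3(V, angle):
    """Apply the matrix of eqn (8.32) to V."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * V[0] - s * V[1], s * V[0] + c * V[1], V[2]]

def first_order_step(V, omega_vec, dt):
    """V + dV with dV = omega dt x V, eqn (8.90)."""
    dV = cross([w * dt for w in omega_vec], V)
    return [v + d for v, d in zip(V, dV)]

def step_error(V, omega, dt):
    """Distance between the exact rotated vector and the first-order estimate."""
    exact = exact_rotation_about_e3(V, omega * dt)
    approx = first_order_step(V, [0.0, 0.0, omega], dt)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(exact, approx)))

V = [1.0, 0.5, 2.0]
error_coarse = step_error(V, 2.0, 1e-2)
error_fine = step_error(V, 2.0, 1e-3)

# A vector parallel to the axis receives no change at all
parallel_change = cross([0.0, 0.0, 2.0 * 1e-2], [0.0, 0.0, 3.0])
```

Reducing $dt$ by a factor of ten reduces the error by roughly a factor of one hundred, consistent with an approximation that is correct to first order in $dt$.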




Fig. 8.5. Geometry of the angular velocity vector $\boldsymbol{\omega}$. The differential $d\mathbf{V}^{(R)}$ is seen to be perpendicular to both $\boldsymbol{\omega}$ and $\mathbf{V}^{(R)}$, corresponding to the cross product in eqn (8.90).



8.12 Velocities of the Masses of a Rigid Body 

The theory of Section 8.10 can be used to find the time derivative of the relative position vector $\boldsymbol{\rho}_n(t)$ discussed in Section 8.2.

From eqn (8.59) of Section 8.9, there is a time-dependent rotation operator $\mathcal{R}(t)$ such that $\boldsymbol{\rho}_n(t) = \mathcal{R}(t)\boldsymbol{\rho}_n(0)$. Replacing $\mathbf{V}^{(R)}(t)$ by $\boldsymbol{\rho}_n(t)$ and $\mathbf{V}$ by $\boldsymbol{\rho}_n(0)$ in eqn (8.64) allows eqn (8.76) to be written as

$$ \frac{d\boldsymbol{\rho}_n(t)}{dt} = \boldsymbol{\omega}(t) \times \boldsymbol{\rho}_n(t) \tag{8.93} $$

This important formula was used in eqn (1.64) in Chapter 1, and will be used extensively in our discussion of rigid body dynamics.

The time derivative of eqn (8.8) then gives the velocity of mass $m_n$ relative to the inertial origin as

$$ \frac{d\mathbf{r}_n(t)}{dt} = \frac{d\mathbf{R}(t)}{dt} + \frac{d\boldsymbol{\rho}_n(t)}{dt} = \frac{d\mathbf{R}(t)}{dt} + \boldsymbol{\omega}(t) \times \boldsymbol{\rho}_n(t) \tag{8.94} $$

It follows that the most general differential displacement of a rigid body in time $dt$ can be described as a differential displacement $d\mathbf{R}$ of its center of mass, together with a rotation by an angle $d\Phi = \omega(t)\, dt$ about an instantaneous axis $\hat{\mathbf{n}}(t)$ passing through the center of mass,

$$ d\mathbf{r}_n = \mathbf{v}_n\, dt = \mathbf{V}\, dt + \boldsymbol{\omega}(t)\, dt \times \boldsymbol{\rho}_n(t) = d\mathbf{R} + d\Phi\, \hat{\mathbf{n}}(t) \times \boldsymbol{\rho}_n(t) \tag{8.95} $$
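
Equation (8.94) translates directly into a short function. The following Python sketch applies it to a hypothetical example that is not taken from the text: a wheel of radius 0.5 whose center moves along $\hat{\mathbf{e}}_1$ at unit speed while it turns about $\hat{\mathbf{e}}_2$ at angular rate 2, with $\hat{\mathbf{e}}_3$ pointing up. The function and variable names are assumptions.

```python
def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def mass_velocity(V_cm, omega, rho):
    """Velocity of a mass: dR/dt + omega x rho_n, eqn (8.94)."""
    return [v + w for v, w in zip(V_cm, cross(omega, rho))]

V_cm = [1.0, 0.0, 0.0]     # velocity of the center of mass (assumed)
omega = [0.0, 2.0, 0.0]    # angular velocity vector (assumed)

v_top = mass_velocity(V_cm, omega, [0.0, 0.0, 0.5])       # point above the center
v_bottom = mass_velocity(V_cm, omega, [0.0, 0.0, -0.5])   # point below the center
```

With these assumed values the translational and rotational contributions add at the top of the wheel and cancel at the bottom.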

8.13 Savio’s Theorem 

In Section 3.3, we asserted that the cohesive forces holding a rigid body together do 
no virtual work. The results of Section 8.12 allow us to give a proof. 

Before presenting the proof, we note that, although eqn (8.95) refers to a differential displacement in a time $dt$, it is actually more general. The parameter $dt$ could be replaced by any parameter that varies monotonically as the body moves. Thus, the most general virtual displacement of a mass $m_n$ of a rigid body, in the sense defined in Section 3.2, is given by

$$ \delta\mathbf{r}_n = \delta\mathbf{R} + \delta\Phi\, \hat{\mathbf{n}} \times \boldsymbol{\rho}_n \tag{8.96} $$

where $\hat{\mathbf{n}}$ is some axis and $\delta\Phi$ is some angle. This is the most general virtual displacement that is consistent with the rigidity of the body.

Theorem 8.13.1: Savio’s Theorem 

If Axioms 1.4.1 and 1.5.1, the laws of linear and angular momentum,

$$ \frac{d\mathbf{P}}{dt} = \mathbf{F}^{(\text{ext})} \quad\text{and}\quad \frac{d\mathbf{J}}{dt} = \boldsymbol{\tau}^{(\text{ext})} \tag{8.97} $$

are assumed to hold for a rigid body, considered as a collection of point masses, then the internal forces of cohesion will do no virtual work.^44

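Proof: For a rigid body, we identify as internal forces of constraint $\mathbf{f}_n^{(\mathrm{int})}$ all those
forces that are not explicitly external. Then, as discussed in Sections 1.4 and 1.5, eqn
(8.97) implies that

$$\mathbf{F}^{(\mathrm{int})} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{int})} = 0 \quad\text{and}\quad \boldsymbol{\tau}^{(\mathrm{int})} = \sum_{n=1}^{N} \mathbf{r}_n \times \mathbf{f}_n^{(\mathrm{int})} = 0 \tag{8.98}$$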

The virtual work done by these internal forces of constraint is defined in Section 3.3.
In vector form, it is

$$\delta W^{(\mathrm{cons})} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{int})} \cdot \delta\mathbf{r}_n \tag{8.99}$$

44 The theorem that the general laws of momentum are sufficient to establish the vanishing of rigid-body
virtual work was derived by the late Mario Savio while he was a graduate student at San Francisco State
University.



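Using the virtual displacement from eqn (8.96) and the definition $\boldsymbol{\rho}_n = \mathbf{r}_n - \mathbf{R}$ from
eqn (1.33) gives

$$\delta W^{(\mathrm{cons})} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{int})} \cdot \delta\mathbf{R} + \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{int})} \cdot \delta\Phi\,\hat{\mathbf{n}} \times (\mathbf{r}_n - \mathbf{R}) \tag{8.100}$$

Factoring quantities that have no index $n$ out of the sums, and rearranging a triple
scalar product, this becomes

$$\delta W^{(\mathrm{cons})} = \mathbf{F}^{(\mathrm{int})} \cdot \delta\mathbf{R} + \boldsymbol{\tau}^{(\mathrm{int})} \cdot \delta\Phi\,\hat{\mathbf{n}} - \mathbf{F}^{(\mathrm{int})} \cdot \delta\Phi\,\hat{\mathbf{n}} \times \mathbf{R} \tag{8.101}$$

Equations (8.98) imply that each term on the right in eqn (8.101) is zero, and hence
that $\delta W^{(\mathrm{cons})} = 0$, as was to be proved. □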
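The statement of the theorem can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes five equal point masses at random positions and uses hypothetical central pair forces (equal, opposite, and directed along the line joining each pair) as one convenient way to obtain internal forces whose sum and total torque vanish, as in eqn (8.98). The function names are illustrative only.

```python
import random

def cross(a, b):
    """Cross product of two 3-vectors stored as lists."""
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def central_pair_forces(points, rng):
    """Hypothetical internal forces: equal and opposite along each pair separation."""
    forces = [[0.0, 0.0, 0.0] for _ in points]
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            k = rng.uniform(-1.0, 1.0)            # arbitrary pair strength
            for c in range(3):
                d = points[j][c] - points[i][c]
                forces[i][c] += k*d
                forces[j][c] -= k*d
    return forces

def virtual_work(points, forces, R, dR, dphi, n):
    """Sum of f_n . dr_n with dr_n = dR + dphi n x (r_n - R), eqns (8.96) and (8.99)."""
    total = 0.0
    for r, f in zip(points, forces):
        rho = [r[c] - R[c] for c in range(3)]
        n_x_rho = cross(n, rho)
        dr = [dR[c] + dphi*n_x_rho[c] for c in range(3)]
        total += dot(f, dr)
    return total

rng = random.Random(0)
points = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]
forces = central_pair_forces(points, rng)
R = [sum(p[c] for p in points)/len(points) for c in range(3)]   # equal masses assumed

net_force = [sum(f[c] for f in forces) for c in range(3)]
net_torque = [0.0, 0.0, 0.0]
for r, f in zip(points, forces):
    t = cross(r, f)
    net_torque = [net_torque[c] + t[c] for c in range(3)]

dW = virtual_work(points, forces, R, dR=[0.3, -0.2, 0.1], dphi=0.05, n=[0.6, 0.0, 0.8])
print(net_force, net_torque, dW)   # all zero up to rounding error
```

Central pair forces are only a sufficient choice for the sketch. The theorem itself needs nothing more than the vanishing totals of eqn (8.98).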

8.14 Infinitesimal Rotation 

Consider again the rotated vector $\mathbf{V}^{(R)}(t)$ in eqn (8.64) of Section 8.10. The difference
$\Delta\mathbf{V}^{(R)}(t)$ between the vectors $\mathbf{V}^{(R)}(t + dt)$ and $\mathbf{V}^{(R)}(t)$ may be approximated by the
differential $d\mathbf{V}^{(R)}(t)$ from eqn (8.91). The error of this approximation approaches zero
in the limit as $dt$ goes to zero. As discussed in Section D.12, the differential $dt$ is
not assumed to be a small quantity. But when it is large, the approximation of the
difference $\Delta\mathbf{V}^{(R)}(t)$ by the differential $d\mathbf{V}^{(R)}(t)$ will in general be poor.

Thus we may use the definition of angle $d\Phi$ from eqn (8.92) to write

$$\Delta\mathbf{V}^{(R)}(t) = \mathbf{V}^{(R)}(t+dt) - \mathbf{V}^{(R)}(t) = d\mathbf{V}^{(R)}(t) + o(dt) = d\Phi\,\mathcal{N}(t)\,\mathbf{V}^{(R)}(t) + o(dt) \tag{8.102}$$

and hence$^{45}$

$$\begin{aligned}
\mathbf{V}^{(R)}(t+dt) &= \mathbf{V}^{(R)}(t) + d\Phi\,\mathcal{N}(t)\,\mathbf{V}^{(R)}(t) + o(dt) \\
&= \left(\mathcal{U} + d\Phi\,\mathcal{N}(t)\right)\mathbf{V}^{(R)}(t) + o(dt) = \mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)]\,\mathbf{V}^{(R)}(t) + o(dt)
\end{aligned} \tag{8.103}$$

The operator

$$\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)] = \mathcal{U} + d\Phi\,\mathcal{N}(t) \tag{8.104}$$

defined in this equation will be referred to as an infinitesimal rotation operator. To
order $o(dt)$ in the limit $dt \to 0$, it transforms $\mathbf{V}^{(R)}(t)$ into its value $\mathbf{V}^{(R)}(t + dt)$ at time
$t + dt$. The notation $\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)]$ should be read as the rotation by angle $d\Phi$ about
instantaneous axis $\hat{\mathbf{n}}(t)$.

45 The symbol $o(dt)$ is discussed in Section D.11. Including it in an equation means that terms of smaller
order than $dt$ are being dropped. In the present context, this means that terms in $dt^2$ or higher powers are
dropped since $\lim_{dt \to 0} (dt)^n/dt = 0$ for $n \ge 2$.

Note that the operator $\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)]$ is indeed a rotation when terms in $dt^2$ and
higher powers are neglected, since it satisfies the orthogonality condition

$$\begin{aligned}
\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)]\,\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)]^T &= \left(\mathcal{U} + d\Phi\,\mathcal{N}(t)\right)\left(\mathcal{U} + d\Phi\,\mathcal{N}(t)\right)^T \\
&= \left(\mathcal{U} + d\Phi\,\mathcal{N}(t)\right)\left(\mathcal{U} - d\Phi\,\mathcal{N}(t)\right) = \mathcal{U} + o(dt)
\end{aligned} \tag{8.105}$$

due to the cancellation of the terms that are linear in $d\Phi$ and hence linear in $dt$.
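The size of the term dropped in eqn (8.105) can be illustrated numerically. The sketch below is an illustration with assumed values (the axis `n` and the two test angles are arbitrary choices, and the helper names are hypothetical): it builds the matrix of $\mathcal{U} + d\Phi\,\mathcal{N}$ and measures how far its product with its own transpose lies from the identity.

```python
def n_matrix(n):
    """Antisymmetric matrix whose action on a vector V is n x V."""
    return [[0.0, -n[2], n[1]],
            [n[2], 0.0, -n[0]],
            [-n[1], n[0], 0.0]]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def infinitesimal_rotation(dphi, n):
    """Matrix of U + dphi N, as in eqn (8.104)."""
    N = n_matrix(n)
    return [[(1.0 if i == j else 0.0) + dphi*N[i][j] for j in range(3)] for i in range(3)]

def orthogonality_defect(dphi, n):
    """Largest element of R R^T - U for the infinitesimal rotation matrix R."""
    R = infinitesimal_rotation(dphi, n)
    P = matmul(R, transpose(R))
    return max(abs(P[i][j] - (1.0 if i == j else 0.0)) for i in range(3) for j in range(3))

n = [0.6, 0.0, 0.8]                      # arbitrary unit vector
d1 = orthogonality_defect(1e-2, n)
d2 = orthogonality_defect(1e-3, n)
print(d1, d2, d1/d2)                     # the ratio is close to 100
```

Reducing the angle by a factor of ten reduces the defect by a factor of about one hundred, which is the behaviour expected of a term of order $d\Phi^2$.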

8.15 Addition of Angular Velocities 

In Section 8.10 and subsequently, we have referred to the angular velocity $\boldsymbol{\omega}(t)$ as
a “vector.” However, vectors have more assumed properties than just the ability to
be used in cross products as in eqn (8.76). For example, vectors can be added, and
their sum is independent of the order of the addends. We now use the concept of
infinitesimal rotation to understand the geometrical meaning of expressions like

$$\boldsymbol{\omega}(t) = \boldsymbol{\omega}_a(t) + \boldsymbol{\omega}_b(t) = \boldsymbol{\omega}_b(t) + \boldsymbol{\omega}_a(t) \tag{8.106}$$

If the addends are assumed to be angular velocity vectors like the ones discussed
above, we now show that the sum $\boldsymbol{\omega}(t)$ in eqn (8.106), in either order, is also a legitimate
angular velocity vector corresponding to the same definite infinitesimal rotation.
If eqn (8.106) is assumed, then we can use eqn (8.92) to write

$$d\Phi\,\hat{\mathbf{n}}(t) = \boldsymbol{\omega}(t)\,dt = \boldsymbol{\omega}_a(t)\,dt + \boldsymbol{\omega}_b(t)\,dt = d\Phi_a\,\hat{\mathbf{n}}_a(t) + d\Phi_b\,\hat{\mathbf{n}}_b(t) \tag{8.107}$$

where $d\Phi_a = \omega_a(t)\,dt$ and $d\Phi_b = \omega_b(t)\,dt$ are the differential angles of the “a” and “b”
rotations. In operator form, this is

$$d\Phi\,\mathcal{N}(t) = d\Phi_a\,\mathcal{N}_a(t) + d\Phi_b\,\mathcal{N}_b(t) \tag{8.108}$$

where operators $\mathcal{N}(t)$, $\mathcal{N}_a(t)$, and $\mathcal{N}_b(t)$ are related to the unit vectors $\hat{\mathbf{n}}(t)$, $\hat{\mathbf{n}}_a(t)$, and
$\hat{\mathbf{n}}_b(t)$ as in eqn (8.89).

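Then eqn (8.104) gives the infinitesimal rotation corresponding to vector $\boldsymbol{\omega}(t)$ as

$$\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)] = \mathcal{U} + d\Phi\,\mathcal{N}(t) = \mathcal{U} + d\Phi_a\,\mathcal{N}_a(t) + d\Phi_b\,\mathcal{N}_b(t) \tag{8.109}$$

But when terms of order $dt^2$ are dropped, this can be written

$$\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)] = \left(\mathcal{U} + d\Phi_a\,\mathcal{N}_a(t)\right)\left(\mathcal{U} + d\Phi_b\,\mathcal{N}_b(t)\right) + o(dt) \tag{8.110}$$

or

$$\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)] = \mathcal{R}_I[d\Phi_a\,\hat{\mathbf{n}}_a(t)]\;\mathcal{R}_I[d\Phi_b\,\hat{\mathbf{n}}_b(t)] + o(dt) \tag{8.111}$$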

The sum of two angular velocity vectors thus corresponds to a compound infinitesimal
rotation consisting of two successive infinitesimal rotations, first the one produced by
$\boldsymbol{\omega}_b\,dt$, followed by the one produced by $\boldsymbol{\omega}_a\,dt$.

But, again due to the neglect of terms of order $dt^2$, we could just as well write the
products in reverse order,

$$\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)] = \left(\mathcal{U} + d\Phi_b\,\mathcal{N}_b(t)\right)\left(\mathcal{U} + d\Phi_a\,\mathcal{N}_a(t)\right) + o(dt) \tag{8.112}$$

or

$$\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)] = \mathcal{R}_I[d\Phi_b\,\hat{\mathbf{n}}_b(t)]\;\mathcal{R}_I[d\Phi_a\,\hat{\mathbf{n}}_a(t)] + o(dt) \tag{8.113}$$

So the sum of two angular velocity vectors corresponds also to a compound infinitesimal
rotation consisting of two successive infinitesimal rotations in the opposite order,
first the one produced by $\boldsymbol{\omega}_a\,dt$, followed by a second one produced by $\boldsymbol{\omega}_b\,dt$.

Thus, with the understanding that terms $dt^2$ and higher are to be dropped, the
sum in eqn (8.106) corresponds to the product of the two infinitesimal rotations in
either order,

$$\mathcal{R}_I[d\Phi_a\,\hat{\mathbf{n}}_a(t)]\;\mathcal{R}_I[d\Phi_b\,\hat{\mathbf{n}}_b(t)] = \mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)] = \mathcal{R}_I[d\Phi_b\,\hat{\mathbf{n}}_b(t)]\;\mathcal{R}_I[d\Phi_a\,\hat{\mathbf{n}}_a(t)] \tag{8.114}$$

The sum of two angular velocity vectors, in either order, corresponds to the same
product of two infinitesimal rotations, since the order of their application makes no
difference when terms containing $dt^2$ and higher powers are dropped.

Thus the vector $\boldsymbol{\omega}(t)$ in eqn (8.106) is a legitimate angular velocity and corresponds
to the same definite, unambiguous infinitesimal rotation regardless of the
order of addition. Angular velocities like $\boldsymbol{\omega}(t)$ are thus vectors and have the algebraic
properties associated with them.

Note to the Reader: Equation (8.114) illustrates an important fact. Although finite
rotations do not commute in general, infinitesimal rotations always commute.
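The contrast between finite and infinitesimal rotations can be made concrete with a short numerical sketch. It is an illustration under assumed values: the two axes are taken along the first and second coordinate axes, the angular speeds `w_a` and `w_b` are arbitrary, and the finite rotation matrix is the standard fixed-axis form $\mathsf{U}\cos\Phi + \mathsf{N}\sin\Phi + \hat{\mathbf{n}}\hat{\mathbf{n}}^T(1-\cos\Phi)$ used here without derivation.

```python
import math

def n_matrix(n):
    """Antisymmetric matrix whose action on a vector V is n x V."""
    return [[0.0, -n[2], n[1]],
            [n[2], 0.0, -n[0]],
            [-n[1], n[0], 0.0]]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def infinitesimal(dphi, n):
    """Matrix of U + dphi N, as in eqn (8.104)."""
    N = n_matrix(n)
    return [[(1.0 if i == j else 0.0) + dphi*N[i][j] for j in range(3)] for i in range(3)]

def finite(phi, n):
    """Finite rotation by angle phi about the unit vector n."""
    N = n_matrix(n)
    c, s = math.cos(phi), math.sin(phi)
    return [[(c if i == j else 0.0) + s*N[i][j] + (1.0 - c)*n[i]*n[j]
             for j in range(3)] for i in range(3)]

def order_mismatch(Ra, Rb):
    """Largest element of Ra Rb - Rb Ra."""
    AB, BA = matmul(Ra, Rb), matmul(Rb, Ra)
    return max(abs(AB[i][j] - BA[i][j]) for i in range(3) for j in range(3))

n_a, n_b = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
w_a, w_b = 1.0, 2.0                                   # assumed angular speeds

m1 = order_mismatch(infinitesimal(w_a*1e-2, n_a), infinitesimal(w_b*1e-2, n_b))
m2 = order_mismatch(infinitesimal(w_a*1e-3, n_a), infinitesimal(w_b*1e-3, n_b))
ratio = m1/m2                                         # close to 100: mismatch is of order dt**2

finite_mismatch = order_mismatch(finite(math.pi/2, n_a), finite(math.pi/2, n_b))
print(m1, m2, ratio, finite_mismatch)
```

The mismatch between the two orderings of the infinitesimal rotations shrinks like $dt^2$, so it disappears at the order retained in eqn (8.114), while the two quarter-turn rotations differ by matrix elements of order one.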



8.16 Fundamental Generators of Rotations 

Since angular velocities can be added, it is legitimate to consider the expansion of $\boldsymbol{\omega}(t)$
into its three Cartesian components to represent the product of three infinitesimal
rotations. These three rotations are now considered.

Let

$$\boldsymbol{\omega}(t) = \omega_1(t)\,\hat{\mathbf{e}}_1 + \omega_2(t)\,\hat{\mathbf{e}}_2 + \omega_3(t)\,\hat{\mathbf{e}}_3 \tag{8.115}$$

as in eqn (8.77). The components $\omega_i(t)$ in this expansion can be used to define the
new quantities $d\Phi_i$ by

$$d\Phi_i = \omega_i(t)\,dt = d\Phi\,n_i(t) \tag{8.116}$$

for $i = 1, 2, 3$, where the second equality follows from eqn (8.85) and the definitions
$d\Phi = \omega(t)\,dt$ and $n_i(t) = \omega_i(t)/\omega(t)$. In vector form, the component definitions in eqn
(8.116) are equivalent to

$$\boldsymbol{\omega}(t)\,dt = d\Phi\,\hat{\mathbf{n}}(t) = d\Phi_1\,\hat{\mathbf{e}}_1 + d\Phi_2\,\hat{\mathbf{e}}_2 + d\Phi_3\,\hat{\mathbf{e}}_3 \tag{8.117}$$



Applying the argument that led from eqn (8.109) to eqn (8.111), it follows that
the infinitesimal rotation corresponding to $\boldsymbol{\omega}(t)\,dt$ can be written as

$$\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)] = \mathcal{R}_I[d\Phi_1\,\hat{\mathbf{e}}_1]\;\mathcal{R}_I[d\Phi_2\,\hat{\mathbf{e}}_2]\;\mathcal{R}_I[d\Phi_3\,\hat{\mathbf{e}}_3] + o(dt) \tag{8.118}$$

where the order of the operators on the right makes no difference to the product,
since by assumption terms containing $dt^2$ and higher powers are being dropped. The
components of the angular velocity vector can be thought of as producing a product
of three infinitesimal rotations with angles $d\Phi_i = \omega_i(t)\,dt$ about the corresponding
coordinate axes $\hat{\mathbf{e}}_i$, with the order of these infinitesimal rotations having no effect on
the final outcome.

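Equation (8.104) shows that each of the three operators on the right in eqn
(8.118) has the form, for $i = 1, 2, 3$,

$$\mathcal{R}_I[d\Phi_i\,\hat{\mathbf{e}}_i] = \mathcal{U} + d\Phi_i\,\mathcal{J}^{(i)} \tag{8.119}$$

where each $\mathcal{J}^{(i)}$ is the operator $\mathcal{N}$ evaluated for the special case in which $\hat{\mathbf{n}} = \hat{\mathbf{e}}_i$.
When terms in $dt^2$ and higher powers are dropped, eqn (8.118) can also be written
as

$$\mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}(t)] = \mathcal{U} + \sum_{i=1}^{3} d\Phi_i\,\mathcal{J}^{(i)} \tag{8.120}$$

The operators $\mathcal{J}^{(i)}$ are called the fundamental generators of infinitesimal rotations
or, more simply, the infinitesimal generators. The matrices corresponding to the $\mathcal{J}^{(i)}$
operators can be derived by setting $\hat{\mathbf{n}}$ to be the vectors with components (1, 0, 0),
(0, 1, 0), and (0, 0, 1), respectively, in eqn (8.87). They are

$$\mathsf{J}^{(1)} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} \qquad
\mathsf{J}^{(2)} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix} \qquad
\mathsf{J}^{(3)} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{8.121}$$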

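The infinitesimal generators do not commute. To see this, recall from eqn (8.89)
that $\mathcal{N}\mathbf{V} = \hat{\mathbf{n}} \times \mathbf{V}$ and hence that the $\mathcal{J}^{(i)}$ obey

$$\mathcal{J}^{(1)}\mathbf{V} = \hat{\mathbf{e}}_1 \times \mathbf{V} \qquad \mathcal{J}^{(2)}\mathbf{V} = \hat{\mathbf{e}}_2 \times \mathbf{V} \qquad \mathcal{J}^{(3)}\mathbf{V} = \hat{\mathbf{e}}_3 \times \mathbf{V} \tag{8.122}$$

Thus, the rule for the expansion of triple cross products, together with the rule of
composition of linear operators, give, for any vector $\mathbf{V}$,

$$\mathcal{J}^{(i)}\mathcal{J}^{(j)}\mathbf{V} = \hat{\mathbf{e}}_i \times (\hat{\mathbf{e}}_j \times \mathbf{V}) = \hat{\mathbf{e}}_j\,(\hat{\mathbf{e}}_i \cdot \mathbf{V}) - \mathbf{V}\,(\hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_j) \tag{8.123}$$

and

$$\mathcal{J}^{(j)}\mathcal{J}^{(i)}\mathbf{V} = \hat{\mathbf{e}}_j \times (\hat{\mathbf{e}}_i \times \mathbf{V}) = \hat{\mathbf{e}}_i\,(\hat{\mathbf{e}}_j \cdot \mathbf{V}) - \mathbf{V}\,(\hat{\mathbf{e}}_j \cdot \hat{\mathbf{e}}_i) \tag{8.124}$$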

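with the result that

$$\left(\mathcal{J}^{(i)}\mathcal{J}^{(j)} - \mathcal{J}^{(j)}\mathcal{J}^{(i)}\right)\mathbf{V} = \hat{\mathbf{e}}_j\,(\hat{\mathbf{e}}_i \cdot \mathbf{V}) - \hat{\mathbf{e}}_i\,(\hat{\mathbf{e}}_j \cdot \mathbf{V}) = (\hat{\mathbf{e}}_i \times \hat{\mathbf{e}}_j) \times \mathbf{V} \tag{8.125}$$

Then using the expansion of $(\hat{\mathbf{e}}_i \times \hat{\mathbf{e}}_j)$ from eqn (A.14) gives

$$\left(\mathcal{J}^{(i)}\mathcal{J}^{(j)} - \mathcal{J}^{(j)}\mathcal{J}^{(i)}\right)\mathbf{V} = \sum_{k=1}^{3} \varepsilon_{ijk}\,\hat{\mathbf{e}}_k \times \mathbf{V} = \sum_{k=1}^{3} \varepsilon_{ijk}\,\mathcal{J}^{(k)}\mathbf{V} \tag{8.126}$$

Since $\mathbf{V}$ is any general vector, eqn (8.126) implies that the commutators of two
fundamental generators are

$$\left[\mathcal{J}^{(i)}, \mathcal{J}^{(j)}\right]_c = \left(\mathcal{J}^{(i)}\mathcal{J}^{(j)} - \mathcal{J}^{(j)}\mathcal{J}^{(i)}\right) = \sum_{k=1}^{3} \varepsilon_{ijk}\,\mathcal{J}^{(k)} \tag{8.127}$$

or, writing the three cases of interest explicitly,

$$\left[\mathcal{J}^{(1)}, \mathcal{J}^{(2)}\right]_c = \mathcal{J}^{(3)} \qquad \left[\mathcal{J}^{(3)}, \mathcal{J}^{(1)}\right]_c = \mathcal{J}^{(2)} \qquad \left[\mathcal{J}^{(2)}, \mathcal{J}^{(3)}\right]_c = \mathcal{J}^{(1)} \tag{8.128}$$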

These fundamental commutation relations control the structure of rotations in
three-dimensional Cartesian spaces. Relations eqn (8.127) define what is called the Lie
algebra of the rotation group.

The commutations for the matrices $\mathsf{J}^{(i)}$ must be the same as for the operators.
These commutation relations can be read from eqn (8.127), or can be derived directly
from eqn (8.121).
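The direct derivation from the matrices of eqn (8.121) amounts to nine small matrix products, which can be checked mechanically. The sketch below is an illustration only: the dictionary `J`, the helper names, and the closed-form expression used for the Levi-Civita symbol are choices made for this example, not notation from the text.

```python
# Matrices of the fundamental generators, copied from eqn (8.121).
J = {
    1: [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    2: [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    3: [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
}

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def commutator(A, B):
    """Matrix commutator A B - B A."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from 1, 2, 3."""
    return (i - j)*(j - k)*(k - i)//2

def structure_sum(i, j):
    """Right side of eqn (8.127): sum over k of eps_ijk J^(k)."""
    return [[sum(levi_civita(i, j, k)*J[k][r][c] for k in (1, 2, 3))
             for c in range(3)] for r in range(3)]

relations_hold = all(commutator(J[i], J[j]) == structure_sum(i, j)
                     for i in (1, 2, 3) for j in (1, 2, 3))
print(relations_hold)
```

Because all entries are integers, the comparison is exact and no rounding tolerance is needed.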



8.17 Rotation with a Fixed Axis 

A time dependent rotation operator in general has a time varying instantaneous axis
of rotation $\hat{\mathbf{n}}(t)$. However, there is an important special case in which one assumes
that the axis of rotation is constrained to be a constant independent of time, $\hat{\mathbf{n}}(t) = \hat{\mathbf{n}}$
for all $t$, where $\hat{\mathbf{n}}$ here is assumed not to be time varying. In this special case, unlike the
general case, the integral of the differential angle $d\Phi$ defined in eqn (8.92) does have
a simple geometric significance. It is the accumulated angle of the fixed-axis rotation.

The operator for rotation by angle $\Phi$ about fixed axis $\hat{\mathbf{n}}$ can be found in closed
form. It will be denoted $\mathcal{R}[\Phi\hat{\mathbf{n}}]$, with $\mathsf{R}[\Phi\hat{\mathbf{n}}]$ for the corresponding matrix, and will
be referred to as a fixed-axis rotation.

The derivation of this operator begins with the differential equation, eqn (8.81).
Making a change of variable from $t$ to $\Phi$, using the definitions

$$\Phi = \int_0^{\Phi} d\Phi' = \int_0^{t} \omega(t')\,dt' \quad\text{and}\quad \frac{d\Phi}{dt} = \omega(t) \tag{8.129}$$

derived from eqn (8.92), gives the differential equation, eqn (8.81), in the form

$$\frac{d\mathcal{R}(\Phi)}{d\Phi} = \frac{1}{\omega(t)}\,\mathcal{W}(t)\,\mathcal{R}(\Phi) = \mathcal{N}(t)\,\mathcal{R}(\Phi) \tag{8.130}$$

where eqn (8.88) was used to get the final equality.

However, the assumed constancy of the axis unit vector $\hat{\mathbf{n}}(t) = \hat{\mathbf{n}}$ implies that
operator $\mathcal{N}(t)$, defined at the beginning of Section 8.11, is also constant in time. Thus
$\mathcal{N}(t) = \mathcal{N}$ where $\mathcal{N}$ is a constant operator with a constant matrix composed of the
components of $\hat{\mathbf{n}}$,

$$\mathsf{N} = \begin{pmatrix} 0 & -n_3 & n_2 \\ n_3 & 0 & -n_1 \\ -n_2 & n_1 & 0 \end{pmatrix} \tag{8.131}$$

where

$$\hat{\mathbf{n}} = n_1\,\hat{\mathbf{e}}_1 + n_2\,\hat{\mathbf{e}}_2 + n_3\,\hat{\mathbf{e}}_3 \tag{8.132}$$

is the constant axis of rotation. Thus eqn (8.130) becomes

$$\frac{d\mathcal{R}(\Phi)}{d\Phi} = \mathcal{N}\,\mathcal{R}(\Phi) \tag{8.133}$$



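The solution to differential equation, eqn (8.133), with a constant operator such
as $\mathcal{N}$ has already been discussed in Section 7.18. From eqn (7.131), it is

$$\mathcal{R}(\Phi) = \exp(\Phi\mathcal{N}) \quad\text{or, in our preferred notation,}\quad \mathcal{R}[\Phi\hat{\mathbf{n}}] = \exp(\Phi\mathcal{N}) \tag{8.134}$$

The exponential in eqn (8.134) can be written in a number of ways. A vector can
be defined by $\boldsymbol{\Phi} = \Phi\hat{\mathbf{n}}$ and a vector with operator components by

$$\boldsymbol{\mathcal{J}} = \sum_{k=1}^{3} \hat{\mathbf{e}}_k\,\mathcal{J}^{(k)} \tag{8.135}$$

where the fundamental infinitesimal generators from Section 8.16 have been used.
Using the matrices defined in eqn (8.121), the matrix in eqn (8.131) and hence the
corresponding operator $\mathcal{N}$ can be expanded as

$$\mathsf{N} = \sum_{k=1}^{3} n_k\,\mathsf{J}^{(k)} \quad\text{and}\quad \mathcal{N} = \sum_{k=1}^{3} n_k\,\mathcal{J}^{(k)} \tag{8.136}$$

The product $\Phi\mathcal{N}$ in eqn (8.134) can then be written as

$$\Phi\mathcal{N} = \sum_{k=1}^{3} \Phi\,n_k\,\mathcal{J}^{(k)} = \Phi\,\hat{\mathbf{n}} \cdot \boldsymbol{\mathcal{J}} = \boldsymbol{\Phi} \cdot \boldsymbol{\mathcal{J}} \tag{8.137}$$

which allows eqn (8.134) to be written as

$$\mathcal{R}[\Phi\hat{\mathbf{n}}] = \exp\left(\Phi\,\hat{\mathbf{n}} \cdot \boldsymbol{\mathcal{J}}\right) = \exp\left(\boldsymbol{\Phi} \cdot \boldsymbol{\mathcal{J}}\right) \tag{8.138}$$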

An important special case arises when the fixed axis is chosen to be one of the
coordinate unit vectors. Rotation by $\Phi$ about a coordinate axis $\hat{\mathbf{e}}_k$ becomes

$$\mathcal{R}[\Phi\hat{\mathbf{e}}_k] = \exp\left(\Phi\,\hat{\mathbf{e}}_k \cdot \boldsymbol{\mathcal{J}}\right) = \exp\left(\Phi\,\mathcal{J}^{(k)}\right) \tag{8.139}$$

8.18 Expansion of Fixed-Axis Rotation

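As discussed in Section 7.18, eqn (8.134) may be expanded in a power series,

$$\mathcal{R}[\Phi\hat{\mathbf{n}}] = \exp(\Phi\mathcal{N}) = \mathcal{U} + \Phi\mathcal{N} + \frac{(\Phi\mathcal{N})^2}{2!} + \frac{(\Phi\mathcal{N})^3}{3!} + \cdots \tag{8.140}$$

This power series may be written as the sum of a finite number of terms.

Theorem 8.18.1: Expansion of Fixed-Axis Rotation

A finite rotation by angle $\Phi$ about a fixed axis $\hat{\mathbf{n}}$ may be written as

$$\mathcal{R}[\Phi\hat{\mathbf{n}}] = \exp(\Phi\mathcal{N}) = \mathcal{U}\cos\Phi + \mathcal{N}\sin\Phi + \mathcal{M}\,(1 - \cos\Phi) \tag{8.141}$$

with the corresponding matrix in the $\hat{\mathbf{e}}_i$ basis,

$$\mathsf{R}[\Phi\hat{\mathbf{n}}] = \exp(\Phi\mathsf{N}) = \mathsf{U}\cos\Phi + \mathsf{N}\sin\Phi + \mathsf{M}\,(1 - \cos\Phi) \tag{8.142}$$

where operator $\mathcal{M}$ has a matrix $\mathsf{M}$ with matrix elements $M_{ij} = n_i n_j$.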

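Proof: The evaluation of the power series in eqn (8.140) is facilitated by recursion
relations for powers of $\mathcal{N}$. Direct matrix multiplication of eqn (8.131) by itself, using
the fact that $n_1^2 + n_2^2 + n_3^2 = 1$ for unit vector $\hat{\mathbf{n}}$, yields

$$\mathsf{N}^2 = \mathsf{M} - \mathsf{U} \quad\text{where}\quad \mathsf{M} = \begin{pmatrix} n_1^2 & n_1 n_2 & n_1 n_3 \\ n_2 n_1 & n_2^2 & n_2 n_3 \\ n_3 n_1 & n_3 n_2 & n_3^2 \end{pmatrix} \tag{8.143}$$

and $\mathsf{U}$ is the identity matrix. Note that matrix $\mathsf{M}$ has the form $M_{ij} = n_i n_j$.

The next power is then

$$\mathsf{N}^3 = \mathsf{N}^2\,\mathsf{N} = (\mathsf{M} - \mathsf{U})\,\mathsf{N} = \mathsf{M}\,\mathsf{N} - \mathsf{N} \tag{8.144}$$

But $\mathsf{M}\,\mathsf{N} = 0$, as can be seen from eqn (8.143) and the total skew-symmetry of $\varepsilon_{ijk}$,

$$(\mathsf{M}\,\mathsf{N})_{ij} = \sum_{k=1}^{3} M_{ik} N_{kj} = \sum_{k=1}^{3}\sum_{l=1}^{3} n_i n_k\,\varepsilon_{klj}\,n_l = n_i \sum_{k=1}^{3}\sum_{l=1}^{3} n_k n_l\,\varepsilon_{klj} = 0 \tag{8.145}$$

This result, together with eqn (8.144), gives $\mathsf{N}^3 = -\mathsf{N}$. Thus

$$\mathcal{N}^2 = \mathcal{M} - \mathcal{U} \qquad \mathcal{N}^3 = -\mathcal{N} \qquad \mathcal{N}^4 = \mathcal{N}^3\mathcal{N} = -\mathcal{N}^2 = -(\mathcal{M} - \mathcal{U}) \tag{8.146}$$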

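and so on through a repeating sequence. Collecting coefficients of $\mathcal{U}$, $\mathcal{N}$, and $\mathcal{M}$ in
eqn (8.140) gives

$$\exp(\Phi\mathcal{N}) = \mathcal{U}\left(1 - \frac{\Phi^2}{2!} + \frac{\Phi^4}{4!} - \cdots\right) + \mathcal{N}\left(\Phi - \frac{\Phi^3}{3!} + \cdots\right) + \mathcal{M}\left(\frac{\Phi^2}{2!} - \frac{\Phi^4}{4!} + \cdots\right) \tag{8.147}$$

Identifying the power series in $\Phi$ with the power series of trigonometric functions
gives eqn (8.141), as was to be proved. □

For example, setting $\hat{\mathbf{n}} = \hat{\mathbf{e}}_3$ gives

$$\mathsf{R}[\Phi\hat{\mathbf{e}}_3] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\cos\Phi + \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}\sin\Phi + \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}(1 - \cos\Phi) \tag{8.148}$$

which reproduces eqn (8.32) derived earlier for this special case.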
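The agreement between the power series of eqn (8.140) and the closed form of eqn (8.142) can also be confirmed numerically. The following is a minimal sketch with assumed inputs: the angle, the axis, and the number of retained series terms are arbitrary, and the function names are chosen for this example.

```python
import math

def n_matrix(n):
    """Matrix of eqn (8.131) for the unit vector n."""
    return [[0.0, -n[2], n[1]],
            [n[2], 0.0, -n[0]],
            [-n[1], n[0], 0.0]]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def rotation_closed_form(phi, n):
    """U cos(phi) + N sin(phi) + M (1 - cos(phi)) with M_ij = n_i n_j, eqn (8.142)."""
    N = n_matrix(n)
    c, s = math.cos(phi), math.sin(phi)
    return [[(c if i == j else 0.0) + s*N[i][j] + (1.0 - c)*n[i]*n[j]
             for j in range(3)] for i in range(3)]

def rotation_series(phi, n, terms=30):
    """Partial sum of the exponential series of eqn (8.140)."""
    N = n_matrix(n)
    term = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    total = [row[:] for row in term]
    for k in range(1, terms + 1):
        product = matmul(term, N)
        term = [[product[i][j]*phi/k for j in range(3)] for i in range(3)]   # (phi N)^k / k!
        total = [[total[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return total

phi, n = 2.0, [0.6, 0.0, 0.8]
closed = rotation_closed_form(phi, n)
series = rotation_series(phi, n)
max_difference = max(abs(closed[i][j] - series[i][j]) for i in range(3) for j in range(3))
print(max_difference)      # agreement to rounding error
```

With thirty terms the truncation error of the series is far below rounding error for angles of this size, so any visible difference would indicate a mistake in one of the two forms.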

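The trace of $\mathcal{R}[\Phi\hat{\mathbf{n}}]$ is easily obtained as the sum of the traces of the terms of eqn
(8.141). It is

$$\mathrm{Tr}\,\mathcal{R}[\Phi\hat{\mathbf{n}}] = 2\cos\Phi + 1 \tag{8.149}$$

Note that the dyadic form of operator $\mathcal{M}$ has the form of a dyad $\mathbb{M} = \hat{\mathbf{n}}\hat{\mathbf{n}}$ with the
consequence that $\mathcal{M}\mathbf{V} = \mathbb{M} \cdot \mathbf{V} = \hat{\mathbf{n}}\,(\hat{\mathbf{n}} \cdot \mathbf{V})$.

The result of the finite rotation of a general vector $\mathbf{V}$ by angle $\Phi$ about a fixed axis
$\hat{\mathbf{n}}$ can thus be written as

$$\mathbf{V}^{(R)} = \mathcal{R}[\Phi\hat{\mathbf{n}}]\,\mathbf{V} = \mathbf{V}\cos\Phi + \hat{\mathbf{n}} \times \mathbf{V}\,\sin\Phi + \hat{\mathbf{n}}\,(\hat{\mathbf{n}} \cdot \mathbf{V})\,(1 - \cos\Phi) \tag{8.150}$$

where eqn (8.89) has also been used, with $\hat{\mathbf{n}}$ constant.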

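The geometric interpretation of eqn (8.150) is immediate. Use eqn (A.3) to write
the original vector as a sum of vectors parallel and perpendicular to $\hat{\mathbf{n}}$, as in
$\mathbf{V} = \mathbf{V}_\parallel + \mathbf{V}_\perp$. Then eqn (8.150) can be written in the same form, as
$\mathbf{V}^{(R)} = \mathbf{V}_\parallel^{(R)} + \mathbf{V}_\perp^{(R)}$, where

$$\mathbf{V}_\parallel^{(R)} = \mathbf{V}_\parallel \quad\text{and}\quad \mathbf{V}_\perp^{(R)} = \left(\mathbf{V}_\perp\cos\Phi + \hat{\mathbf{n}} \times \mathbf{V}_\perp\,\sin\Phi\right) \tag{8.151}$$

The original vector component $\mathbf{V}_\parallel$ parallel to rotation axis $\hat{\mathbf{n}}$ is unchanged by the
rotation, as one would expect. The vector $\mathbf{V}_\perp^{(R)}$ perpendicular to $\hat{\mathbf{n}}$ has the same
magnitude $V_\perp$ as the original perpendicular vector $\mathbf{V}_\perp$, but is rotated by angle $\Phi$ in the
right-hand sense about axis $\hat{\mathbf{n}}$.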



Fig. 8.6. Illustration of eqn (8.151). The component $\mathbf{V}_\perp$ perpendicular to $\hat{\mathbf{n}}$ is rotated by angle
$\Phi$ to give $\mathbf{V}_\perp^{(R)}$. The component parallel to $\hat{\mathbf{n}}$ is not changed.

Since rotation operators obey eqn (8.22), the inverse of a fixed-axis rotation is
$\mathcal{R}[\Phi\hat{\mathbf{n}}]^{-1} = \mathcal{R}[\Phi\hat{\mathbf{n}}]^T$. The expansion eqn (8.140) gives

$$\mathcal{R}[\Phi\hat{\mathbf{n}}]^T = \left\{\exp(\Phi\mathcal{N})\right\}^T = \exp\left(\Phi\mathcal{N}^T\right) = \exp(-\Phi\mathcal{N}) = \mathcal{R}[-\Phi\hat{\mathbf{n}}] \tag{8.152}$$

since $\mathcal{N}$ is anti-symmetric. Thus, as one would expect, the inverse is a rotation by the
same angle about an oppositely directed axis,

$$\mathcal{R}[\Phi\hat{\mathbf{n}}]^{-1} = \mathcal{R}[-\Phi\hat{\mathbf{n}}] \tag{8.153}$$
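The vector form of eqn (8.150) translates directly into a few lines of code, and the inverse property of eqn (8.153) gives a convenient self-check. The sketch below uses assumed test values, and the function name `rotate` is a hypothetical choice for this illustration.

```python
import math

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def rotate(v, phi, n):
    """Rotate v by angle phi about the unit vector n, following eqn (8.150)."""
    c, s = math.cos(phi), math.sin(phi)
    n_x_v = cross(n, v)
    n_dot_v = dot(n, v)
    return [v[i]*c + n_x_v[i]*s + n[i]*n_dot_v*(1.0 - c) for i in range(3)]

n = [0.6, 0.0, 0.8]                       # arbitrary unit axis
v = [0.3, -1.2, 0.7]                      # arbitrary vector
phi = 1.1

v_rot = rotate(v, phi, n)
v_back = rotate(v_rot, -phi, n)           # inverse rotation, eqn (8.153)

round_trip_error = max(abs(v_back[i] - v[i]) for i in range(3))
parallel_change = abs(dot(n, v_rot) - dot(n, v))        # parallel part is unchanged
length_change = abs(math.sqrt(dot(v_rot, v_rot)) - math.sqrt(dot(v, v)))
quarter_turn = rotate([1.0, 0.0, 0.0], math.pi/2, [0.0, 0.0, 1.0])   # e1 goes to e2
print(round_trip_error, parallel_change, length_change, quarter_turn)
```

Working with the vector form avoids building a matrix at all, which can be convenient when only a few vectors need to be rotated.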

8.19 Eigenvectors of the Fixed-Axis Rotation Operator 

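The eigenvectors and eigenvalues of the fixed-axis rotation operator $\mathcal{R}[\Phi\hat{\mathbf{n}}]$ are easily
derived. From eqn (8.134) we know that $\mathcal{R}[\Phi\hat{\mathbf{n}}] = \exp(\Phi\mathcal{N})$, where $\mathcal{N}$ is a real
anti-symmetric operator associated with unit vector $\hat{\mathbf{n}}$ by eqns (8.131, 8.132).

To begin, we solve the eigenvalue problem for $\mathcal{N}$ by noting that this operator
is identical to the $\mathcal{W}$ treated in Section 7.13 except for the substitution of $\hat{\mathbf{n}}$ for $\boldsymbol{\omega}$.
Setting $\omega = 1$ in eqn (7.89), since $\hat{\mathbf{n}}$ is a unit vector, gives the eigenvalues of operator
$\mathcal{N}$ as

$$\lambda_1^{(N)} = i \qquad \lambda_2^{(N)} = -i \qquad \lambda_3^{(N)} = 0 \tag{8.154}$$

with corresponding normalized eigenvectors

$$\hat{\mathbf{V}}^{(1)} = \left(\hat{\mathbf{a}} - i\hat{\mathbf{b}}\right)/\sqrt{2}, \qquad \hat{\mathbf{V}}^{(2)} = \left(\hat{\mathbf{a}} + i\hat{\mathbf{b}}\right)/\sqrt{2}, \quad\text{and}\quad \hat{\mathbf{V}}^{(3)} = \hat{\mathbf{n}} \tag{8.155}$$

where $\hat{\mathbf{a}}$ is some real unit vector perpendicular to $\hat{\mathbf{n}}$ but otherwise arbitrary and
$\hat{\mathbf{b}} = \hat{\mathbf{n}} \times \hat{\mathbf{a}}$ is also a real unit vector, perpendicular to both $\hat{\mathbf{a}}$ and $\hat{\mathbf{n}}$.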

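By eqn (7.121) of Section 7.17, as discussed also in Section 7.18, the eigenvectors
of $\mathcal{R}[\Phi\hat{\mathbf{n}}] = \exp(\Phi\mathcal{N})$ are the same as those of $\mathcal{N}$, and the eigenvalues are exponential
functions of those in eqn (8.154),

$$\lambda_1 = \exp(i\Phi) \qquad \lambda_2 = \exp(-i\Phi) \qquad \lambda_3 = \exp(0\,\Phi) = 1 \tag{8.156}$$

The dyadic $\mathbb{R}[\Phi\hat{\mathbf{n}}]$ corresponding to $\mathcal{R}[\Phi\hat{\mathbf{n}}]$ can be obtained in eigen-dyadic form from
eqn (7.113) of Theorem 7.16.1. It is

$$\mathbb{R}[\Phi\hat{\mathbf{n}}] = \sum_{k=1}^{3} \hat{\mathbf{V}}^{(k)}\,\lambda_k\,\hat{\mathbf{V}}^{(k)*} \tag{8.157}$$

where the eigenvalues $\lambda_k$ are from eqn (8.156).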

The eigenvalue problem for $\mathcal{R}[\Phi\hat{\mathbf{n}}]$ is now completely solved. As discussed in
Section 7.13, making different choices of arbitrary unit vector $\hat{\mathbf{a}}$ is equivalent to
multiplying the first two eigenvectors by $\exp(i\alpha)$ and $\exp(-i\alpha)$, respectively, where $\alpha$ is
some real number. This is only a trivial change, since in any case eigenvectors are
determined only up to a multiplicative constant of modulus unity. As proved in Lemma
7.16.2, such a multiplication also makes no change in the dyadic eqn (8.157), since
the exponential factors cancel. Thus, in spite of its appearance, eqn (8.157) is in fact
independent of the choice of $\hat{\mathbf{a}}$. If written out in terms of $\hat{\mathbf{a}}$, $\hat{\mathbf{b}}$, and $\hat{\mathbf{n}}$, eqn (8.157) will
be seen to reduce to eqn (8.150) which depends only on $\Phi$ and $\hat{\mathbf{n}}$.
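The eigenvectors of eqn (8.155) and eigenvalues of eqn (8.156) can be verified with complex arithmetic. The sketch below is illustrative: the axis `n`, the perpendicular unit vector `a`, and the angle are assumed values, and the rotation matrix is built from the closed form of eqn (8.142).

```python
import cmath
import math

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def rotation_matrix(phi, n):
    """Closed form of eqn (8.142) for unit vector n."""
    N = [[0.0, -n[2], n[1]], [n[2], 0.0, -n[0]], [-n[1], n[0], 0.0]]
    c, s = math.cos(phi), math.sin(phi)
    return [[(c if i == j else 0.0) + s*N[i][j] + (1.0 - c)*n[i]*n[j]
             for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k]*v[k] for k in range(3)) for i in range(3)]

def residual(R, v, lam):
    """Largest component of R v - lam v."""
    Rv = matvec(R, v)
    return max(abs(Rv[i] - lam*v[i]) for i in range(3))

n = [0.6, 0.0, 0.8]                 # rotation axis
a = [0.0, 1.0, 0.0]                 # any real unit vector perpendicular to n
b = cross(n, a)                     # b = n x a
phi = 0.9
R = rotation_matrix(phi, n)

v1 = [(a[i] - 1j*b[i])/math.sqrt(2.0) for i in range(3)]
v2 = [(a[i] + 1j*b[i])/math.sqrt(2.0) for i in range(3)]
v3 = n

res1 = residual(R, v1, cmath.exp(1j*phi))
res2 = residual(R, v2, cmath.exp(-1j*phi))
res3 = residual(R, v3, 1.0)
print(res1, res2, res3)             # all zero up to rounding error
```

Replacing `a` by any other unit vector perpendicular to `n` changes `v1` and `v2` only by phase factors, so the residuals stay at rounding level.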

8.20 The Euler Theorem

We have now discussed two different types of rotations. The first, in Section 8.10,
happens when a rigid body is rotated in a general way during a time $t$, first about one
axis and then about another, and so on. The end product of all of this various motion
is still a rotation, however. The operator $\mathcal{R}(t)$ at time $t$ is an orthogonal operator.

The other type is what we have called a fixed-axis rotation, discussed in Section
8.17. In this case, the rigid body is rotated by an angle $\Phi$ about an axis that does not
change, somewhat as if the rigid body were mounted on a lathe.

The Euler Theorem proves a result that may seem obvious: Any general rotation of
the first type could have been accomplished by some fixed-axis rotation of the second
type. This does not mean that it necessarily was accomplished by a fixed-axis rotation,
only that it could have been. If one starts with some standard orientation of a rigid
body at time zero and rotates it during time $t$ in a general manner, the final orientation
could as well have been produced by starting from the same standard orientation and
rotating by some angle $\Phi$ about some fixed axis $\hat{\mathbf{n}}$.

Theorem 8.20.1: The Euler Theorem 

For any general proper, orthogonal operator $\mathcal{R}$, there exist a fixed axis $\hat{\mathbf{n}}$ and an angle $\Phi$
in the range $0 \le \Phi \le \pi$ such that

$$\mathcal{R}[\Phi\hat{\mathbf{n}}] = \mathcal{R} \tag{8.158}$$

Proof: We show that the dyadic form of a general $\mathcal{R}$ is identical to the dyadic form
of some fixed-axis rotation $\mathcal{R}[\Phi\hat{\mathbf{n}}]$. Since two operators with identical dyadics are
themselves identical, this will prove the theorem.

The first step is to find the eigenvalues of a general rotation $\mathcal{R}$. Use eqn (8.22) to
write

$$(\mathcal{R} - \mathcal{U})\,\mathcal{R}^T = \mathcal{U} - \mathcal{R}^T = -(\mathcal{R} - \mathcal{U})^T \tag{8.159}$$

and then take determinants of both sides,

$$\det(\mathcal{R} - \mathcal{U})\,\det\mathcal{R} = (-1)^3 \det(\mathcal{R} - \mathcal{U}) \tag{8.160}$$

where we used $\det\mathcal{R}^T = \det\mathcal{R}$, and $\det(\alpha\mathcal{R}) = \alpha^3 \det\mathcal{R}$ for three-dimensional
operators. Since $\det\mathcal{R} = +1$ for proper orthogonal operators, the result is $\det(\mathcal{R} - \mathcal{U}) = 0$,
which, according to eqn (7.86), shows that $+1$ is an eigenvalue of $\mathcal{R}$. Call this eigenvalue
$\lambda_3 = 1$. To find the other two eigenvalues, we use eqns (7.111, 7.112) of Section 7.15
relating the determinant and trace to the eigenvalues of $\mathcal{R}$. Since the trace
of $\mathcal{R}$ is defined as the sum $\mathrm{Tr}\,\mathcal{R} = (R_{11} + R_{22} + R_{33})$ which is a real number, the
sum $(\lambda_1 + \lambda_2 + \lambda_3) = (\lambda_1 + \lambda_2 + 1)$ must be real, which implies the relation between
imaginary parts $\Im(\lambda_2) = -\Im(\lambda_1)$. This, together with $1 = \det\mathcal{R} = \lambda_1\lambda_2\lambda_3 = \lambda_1\lambda_2$,
implies that

$$\lambda_1 = \exp(i\Phi) \qquad \lambda_2 = \exp(-i\Phi) \qquad \lambda_3 = 1 \tag{8.161}$$

where $\Phi$ is some real number. The value of this number can be found from

$$(R_{11} + R_{22} + R_{33}) = \mathrm{Tr}\,\mathcal{R} = \lambda_1 + \lambda_2 + \lambda_3 = 1 + 2\cos\Phi \tag{8.162}$$

where $\Phi$ can be restricted to the range $0 \le \Phi \le \pi$. The three eigenvalues of a general
proper orthogonal operator $\mathcal{R}$ are thus determined uniquely.

The (real) eigenvector $\mathbf{V}^{(3)}$ corresponding to eigenvalue $+1$ is found by setting
$\lambda_3 = 1$ in the eigenvector equation, eqn (7.86), and solving for the eigenvector. The
equation to be solved is

$$(\mathcal{R} - \mathcal{U})\,\mathbf{V}^{(3)} = 0, \quad\text{or}\quad (\mathsf{R} - \mathsf{U})\,[V^{(3)}] = 0 \tag{8.163}$$

in matrix form.

It follows from eqn (8.163) in the form $\mathcal{R}\mathbf{V}^{(3)} = \mathbf{V}^{(3)}$ that the normalized eigenvector
$\hat{\mathbf{n}} = \mathbf{V}^{(3)}/V^{(3)}$ is not changed by $\mathcal{R}$. Thus $\hat{\mathbf{n}}$ will be along the axis of rotation of
$\mathcal{R}$. Only one rather trivial difficulty remains, the choice of direction for $\hat{\mathbf{n}}$. The
eigenvector equation, eqn (8.163), only determines real unit vector $\hat{\mathbf{n}}$ up to a factor $\pm 1$.
It is necessary to compare the action of $\mathcal{R}$ on some vector not parallel to $\hat{\mathbf{n}}$. If that
rotation is not in a right-handed sense about $\hat{\mathbf{n}}$, the direction of $\hat{\mathbf{n}}$ must be reversed. A
unique axis direction and angle $\Phi$ are thus obtained, with positive angle $\Phi$ meaning
rotation in a right-handed sense about $\hat{\mathbf{n}}$.

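We now find the other two eigenvectors of $\mathcal{R}$. Since $\mathcal{R}$ is a normal operator with
three distinct eigenvalues, Lemma B.26.2 proves that it must possess three eigenvectors
which are orthogonal in the extended sense $\hat{\mathbf{V}}^{(k)*} \cdot \hat{\mathbf{V}}^{(l)} = \delta_{kl}$. Setting $\hat{\mathbf{n}} = \hat{\mathbf{V}}^{(3)}$,
and recalling that $\hat{\mathbf{n}}$ is real, the other two eigenvectors must be composed of real and
imaginary parts, both of which are perpendicular to $\hat{\mathbf{n}}$. Setting $\mathbf{V}^{(1)} = \mathbf{a} - i\mathbf{b}$ where $\mathbf{a}$
and $\mathbf{b}$ are unknown real vectors perpendicular to $\hat{\mathbf{n}}$, the first eigenvalue equation is

$$\mathcal{R}\,\mathbf{V}^{(1)} = \lambda_1 \mathbf{V}^{(1)} \tag{8.164}$$

Since $\mathcal{R}$ is a real operator and $\lambda_2 = \lambda_1^*$, the complex conjugate of eqn (8.164),

$$\mathcal{R}\,\mathbf{V}^{(1)*} = \lambda_1^* \mathbf{V}^{(1)*} = \lambda_2 \mathbf{V}^{(1)*} \tag{8.165}$$

implies that $\mathbf{V}^{(2)} = \mathbf{V}^{(1)*} = \mathbf{a} + i\mathbf{b}$ is the eigenvector corresponding to $\lambda_2$.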

The orthogonality of these two eigenvectors then implies the vanishing of the real
and imaginary parts of the expression

$$0 = \mathbf{V}^{(1)*} \cdot \mathbf{V}^{(2)} = (\mathbf{a} + i\mathbf{b}) \cdot (\mathbf{a} + i\mathbf{b}) = \left(a^2 - b^2\right) + 2i\,(\mathbf{a} \cdot \mathbf{b}) \tag{8.166}$$

which requires that vectors $\mathbf{a}$ and $\mathbf{b}$ must be orthogonal and have the same magnitude.
The vector $\mathbf{b}$ must therefore be $\mathbf{b} = \hat{\mathbf{n}} \times \mathbf{a}$. (The other possible choice $\mathbf{b} = -\hat{\mathbf{n}} \times \mathbf{a}$ would
have the effect of making positive $\Phi$ mean rotation about $\hat{\mathbf{n}}$ in the left-handed sense,
whereas $\hat{\mathbf{n}}$ has already been chosen above so that $\mathcal{R}$ produces rotation in the
right-handed sense.) Normalizing the eigenvectors of $\mathcal{R}$ using $\hat{\mathbf{V}}^{(k)*} \cdot \hat{\mathbf{V}}^{(k)} = 1$ for $k = 1, 2, 3$
shows that the eigenvectors of $\mathcal{R}$ may be written as

$$\hat{\mathbf{V}}^{(1)} = \left(\hat{\mathbf{a}} - i\hat{\mathbf{b}}\right)/\sqrt{2}, \qquad \hat{\mathbf{V}}^{(2)} = \left(\hat{\mathbf{a}} + i\hat{\mathbf{b}}\right)/\sqrt{2}, \quad\text{and}\quad \hat{\mathbf{V}}^{(3)} = \hat{\mathbf{n}} \tag{8.167}$$

where $\hat{\mathbf{a}}$ and hence $\hat{\mathbf{b}} = \hat{\mathbf{n}} \times \hat{\mathbf{a}}$ are unit vectors.

The dyadic form of normal operator $\mathcal{R}$ is thus given by eqn (7.113) as

$$\mathbb{R} = \sum_{k=1}^{3} \hat{\mathbf{V}}^{(k)}\,\lambda_k\,\hat{\mathbf{V}}^{(k)*} \tag{8.168}$$

where the eigenvalues are those in eqn (8.161) above, with the angle $\Phi$ found in eqn
(8.162), and eigenvectors are those in eqn (8.167) with the axis $\hat{\mathbf{n}}$ found from eqn
(8.163).

If we put the same angle $\Phi$ and same axis $\hat{\mathbf{n}}$ into eqn (8.157) of Section 8.19, we
obtain a dyadic $\mathbb{R}[\Phi\hat{\mathbf{n}}]$ which is exactly the same as eqn (8.168), except for a possibly
different choice of arbitrary unit vector $\hat{\mathbf{a}}$. But, as discussed in Section 8.19, the
dyadic is independent of the particular choice of $\hat{\mathbf{a}}$. Different choices are equivalent
to multiplying eigenvectors by phase factors of modulus unity that cancel from the
dyadics. The dyadic of $\mathcal{R}$ is thus identical to that of $\mathcal{R}[\Phi\hat{\mathbf{n}}]$. But two operators with
identical dyadics are themselves identical. Hence $\mathcal{R}[\Phi\hat{\mathbf{n}}] = \mathcal{R}$, which proves the Euler
Theorem. □
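In practice the angle and axis promised by the theorem can be extracted from a given rotation matrix. The sketch below is one possible way to do so and is not the eigenvector procedure of the proof: the angle comes from the trace as in eqn (8.162), while the axis is read from the antisymmetric part of the matrix, using the fact that in eqn (8.142) only the $\mathsf{N}\sin\Phi$ term is antisymmetric. This shortcut is a swapped-in technique that assumes $\sin\Phi$ is not close to zero; the function names and test rotations are hypothetical.

```python
import math

def rotation_matrix(phi, n):
    """Closed form of eqn (8.142) for unit vector n."""
    N = [[0.0, -n[2], n[1]], [n[2], 0.0, -n[0]], [-n[1], n[0], 0.0]]
    c, s = math.cos(phi), math.sin(phi)
    return [[(c if i == j else 0.0) + s*N[i][j] + (1.0 - c)*n[i]*n[j]
             for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def axis_angle(R):
    """Angle in [0, pi] from the trace, axis from R - R^T = 2 sin(phi) N."""
    cos_phi = (R[0][0] + R[1][1] + R[2][2] - 1.0)/2.0        # eqn (8.162)
    phi = math.acos(max(-1.0, min(1.0, cos_phi)))
    s = math.sin(phi)
    if s < 1e-9:
        raise ValueError("this sketch does not handle angles near 0 or pi")
    n = [(R[2][1] - R[1][2])/(2.0*s),
         (R[0][2] - R[2][0])/(2.0*s),
         (R[1][0] - R[0][1])/(2.0*s)]
    return phi, n

# A general rotation built as a product of two fixed-axis rotations about different axes.
R = matmul(rotation_matrix(0.8, [1.0, 0.0, 0.0]), rotation_matrix(1.1, [0.0, 0.6, 0.8]))
phi, n = axis_angle(R)
rebuilt = rotation_matrix(phi, n)

rebuild_error = max(abs(rebuilt[i][j] - R[i][j]) for i in range(3) for j in range(3))
axis_norm = math.sqrt(sum(x*x for x in n))
print(phi, n, rebuild_error, axis_norm)
```

Because $\sin\Phi > 0$ for $0 < \Phi < \pi$, the axis obtained this way automatically has the right-handed sense discussed in the proof. Angles near $0$ or $\pi$ would need the eigenvector equation, eqn (8.163), instead.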

8.21 Rotation of Operators 

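Suppose that $\mathbf{W} = \mathcal{F}\mathbf{V}$ where $\mathcal{F}$ is a linear operator and $\mathbf{V}$ a general vector. Suppose
a rotation $\mathcal{R}$ to act on both $\mathbf{V}$ and $\mathbf{W}$ giving $\mathbf{V}^{(R)} = \mathcal{R}\mathbf{V}$ and $\mathbf{W}^{(R)} = \mathcal{R}\mathbf{W}$. Then we
can find a linear operator $\mathcal{F}^{(R)}$ that will map $\mathbf{V}^{(R)}$ into $\mathbf{W}^{(R)}$ as in $\mathbf{W}^{(R)} = \mathcal{F}^{(R)}\mathbf{V}^{(R)}$.

To do so, write

$$\mathbf{W}^{(R)} = \mathcal{R}\mathbf{W} = \mathcal{R}\mathcal{F}\mathbf{V} = \mathcal{R}\mathcal{F}\mathcal{R}^T\mathcal{R}\mathbf{V} = \mathcal{F}^{(R)}\mathbf{V}^{(R)} \tag{8.169}$$

which leads to the definition

$$\mathcal{F}^{(R)} = \mathcal{R}\mathcal{F}\mathcal{R}^T \tag{8.170}$$

We refer to $\mathcal{F}^{(R)}$ as a rotated operator, since its action on the rotated vectors mimics
that of the original operator $\mathcal{F}$ on the original vectors.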

8.22 Rotation of the Fundamental Generators 

The rotated operators of the fundamental infinitesimal generators defined in
Section 8.16 are of particular interest. They are

$$\mathcal{J}^{(k)(R)} = \mathcal{R}\mathcal{J}^{(k)}\mathcal{R}^T = \sum_{l=1}^{3} \mathcal{J}^{(l)} R_{lk} = \sum_{l=1}^{3} R^T_{kl}\,\mathcal{J}^{(l)} \tag{8.171}$$

where $R_{lk}$ are the matrix elements of rotation $\mathcal{R}$.

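To prove eqn (8.171), we let $\mathcal{J}^{(k)(R)} = \mathcal{R}\mathcal{J}^{(k)}\mathcal{R}^T$ act on a general vector $\mathbf{V}$. The
result is

$$\mathcal{J}^{(k)(R)}\mathbf{V} = \mathcal{R}\mathcal{J}^{(k)}\mathcal{R}^T\mathbf{V} = \mathcal{R}\left(\hat{\mathbf{e}}_k \times \left(\mathcal{R}^T\mathbf{V}\right)\right) \tag{8.172}$$

where eqn (8.122) was used. The invariance of cross products under proper rotation
from eqn (8.39) then gives

$$\mathcal{J}^{(k)(R)}\mathbf{V} = \mathcal{R}\left(\hat{\mathbf{e}}_k \times \left(\mathcal{R}^T\mathbf{V}\right)\right) = (\mathcal{R}\hat{\mathbf{e}}_k) \times \left(\mathcal{R}\mathcal{R}^T\mathbf{V}\right) = (\mathcal{R}\hat{\mathbf{e}}_k) \times \mathbf{V} \tag{8.173}$$

Inserting a resolution of unity $\mathbb{U} = \sum_{l=1}^{3} \hat{\mathbf{e}}_l\hat{\mathbf{e}}_l$ to expand $\mathcal{R}\hat{\mathbf{e}}_k$ as

$$\mathcal{R}\hat{\mathbf{e}}_k = \mathbb{U} \cdot (\mathcal{R}\hat{\mathbf{e}}_k) = \sum_{l=1}^{3} \hat{\mathbf{e}}_l\,\left(\hat{\mathbf{e}}_l \cdot \mathcal{R}\hat{\mathbf{e}}_k\right) = \sum_{l=1}^{3} \hat{\mathbf{e}}_l\,R_{lk} \tag{8.174}$$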



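eqn (8.173) becomes

$$\mathcal{J}^{(k)(R)}\mathbf{V} = (\mathcal{R}\hat{\mathbf{e}}_k) \times \mathbf{V} = \sum_{l=1}^{3} R_{lk}\,\hat{\mathbf{e}}_l \times \mathbf{V} = \sum_{l=1}^{3} R_{lk}\,\mathcal{J}^{(l)}\mathbf{V} \tag{8.175}$$

Since $\mathbf{V}$ was an arbitrary vector we get finally the operator equality
$\mathcal{J}^{(k)(R)} = \sum_{l=1}^{3} \mathcal{J}^{(l)} R_{lk}$, which is the same as eqn (8.171).

Note the similarity between eqns (8.174, 8.171). The infinitesimal rotation generators
transform under rotation in the same way as the Cartesian basis vectors $\hat{\mathbf{e}}_k$
do.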
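Equation (8.171) can be checked numerically in matrix form. The sketch below is an illustration with an assumed rotation (angle 0.7 about an arbitrary unit axis, built from the closed form of eqn (8.142)); indices run from 0 to 2 in the code rather than 1 to 3, and the helper names are hypothetical.

```python
import math

# Generator matrices from eqn (8.121), stored with zero-based index k = 0, 1, 2.
J = [
    [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
    [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rotation_matrix(phi, n):
    """Closed form of eqn (8.142) for unit vector n."""
    N = [[0.0, -n[2], n[1]], [n[2], 0.0, -n[0]], [-n[1], n[0], 0.0]]
    c, s = math.cos(phi), math.sin(phi)
    return [[(c if i == j else 0.0) + s*N[i][j] + (1.0 - c)*n[i]*n[j]
             for j in range(3)] for i in range(3)]

R = rotation_matrix(0.7, [0.6, 0.0, 0.8])      # assumed proper rotation
RT = transpose(R)

worst = 0.0
for k in range(3):
    lhs = matmul(matmul(R, J[k]), RT)                          # R J^(k) R^T
    rhs = [[sum(J[l][i][j]*R[l][k] for l in range(3))          # sum over l of J^(l) R_lk
            for j in range(3)] for i in range(3)]
    worst = max(worst, max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3)))
print(worst)      # zero up to rounding error
```

The derivation relies on the invariance of cross products under proper rotations, so the check is meaningful only when `R` has determinant +1.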

8.23 Rotation of a Fixed-Axis Rotation 

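The rotation of a fixed-axis rotation operator may now be derived. Suppose the operator
$\mathcal{F}$ in eqn (8.170) to be a fixed-axis rotation $\mathcal{R}[\Phi\hat{\mathbf{n}}]$ discussed in Section 8.17. Let
this fixed-axis rotation map a general vector $\mathbf{V}$ into another vector $\mathbf{W}$ so that

$$\mathbf{W} = \mathcal{R}[\Phi\hat{\mathbf{n}}]\,\mathbf{V} \tag{8.176}$$

Now suppose some rotation $\mathcal{R}$ (not usually the same as $\mathcal{R}[\Phi\hat{\mathbf{n}}]$) is applied to both $\mathbf{V}$
and $\mathbf{W}$, to give $\mathbf{V}^{(R)} = \mathcal{R}\mathbf{V}$ and $\mathbf{W}^{(R)} = \mathcal{R}\mathbf{W}$. We expect intuitively that a fixed-axis
rotation by the same angle $\Phi$ but about a rotated axis $\hat{\mathbf{n}}^{(R)} = \mathcal{R}\hat{\mathbf{n}}$ should map $\mathbf{V}^{(R)}$
into $\mathbf{W}^{(R)}$ as in

$$\mathbf{W}^{(R)} = \mathcal{R}[\Phi\hat{\mathbf{n}}^{(R)}]\,\mathbf{V}^{(R)} \tag{8.177}$$

In effect, the original rotation should itself be rotated, its fixed axis changed from $\hat{\mathbf{n}}$
to $\hat{\mathbf{n}}^{(R)}$.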

We now prove this important result formally. 

Theorem 8.23.1: Rotation of Fixed-Axis Rotation 

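If $\mathcal{R}[\Phi\hat{\mathbf{n}}]$ is a fixed-axis rotation operator, and if $\mathcal{R}$ is some other rotation, then

$$\mathcal{R}[\Phi\hat{\mathbf{n}}]^{(R)} = \mathcal{R}\,\mathcal{R}[\Phi\hat{\mathbf{n}}]\,\mathcal{R}^T = \mathcal{R}[\Phi\hat{\mathbf{n}}^{(R)}] \tag{8.178}$$

where $\hat{\mathbf{n}}^{(R)} = \mathcal{R}\hat{\mathbf{n}}$.

Proof: The proof begins with eqn (8.140) which gives

$$\begin{aligned}
\mathcal{R}[\Phi\hat{\mathbf{n}}]^{(R)} &= \mathcal{R}\,\mathcal{R}[\Phi\hat{\mathbf{n}}]\,\mathcal{R}^T = \mathcal{R}\exp(\Phi\mathcal{N})\,\mathcal{R}^T \\
&= \mathcal{R}\,\mathcal{U}\,\mathcal{R}^T + \mathcal{R}\,(\Phi\mathcal{N})\,\mathcal{R}^T + \mathcal{R}\,\frac{(\Phi\mathcal{N})^2}{2!}\,\mathcal{R}^T + \mathcal{R}\,\frac{(\Phi\mathcal{N})^3}{3!}\,\mathcal{R}^T + \cdots
\end{aligned} \tag{8.179}$$

where the linearity of $\mathcal{R}$ has been used. Noting that

$$\mathcal{R}\mathcal{N}^k\mathcal{R}^T = \mathcal{R}\mathcal{N}\mathcal{N}\cdots\mathcal{N}\mathcal{R}^T = \mathcal{R}\mathcal{N}\mathcal{R}^T\,\mathcal{R}\mathcal{N}\mathcal{R}^T \cdots \mathcal{R}\mathcal{N}\mathcal{R}^T = \left(\mathcal{R}\mathcal{N}\mathcal{R}^T\right)^k \tag{8.180}$$

where unity operators in the form $\mathcal{U} = \mathcal{R}^T\mathcal{R}$ were inserted between $\mathcal{N}$ factors, gives

$$\begin{aligned}
\mathcal{R}[\Phi\hat{\mathbf{n}}]^{(R)} &= \mathcal{R}\left(\exp(\Phi\mathcal{N})\right)\mathcal{R}^T \\
&= \mathcal{R}\,\mathcal{U}\,\mathcal{R}^T + \left(\Phi\,\mathcal{R}\mathcal{N}\mathcal{R}^T\right) + \frac{\left(\Phi\,\mathcal{R}\mathcal{N}\mathcal{R}^T\right)^2}{2!} + \frac{\left(\Phi\,\mathcal{R}\mathcal{N}\mathcal{R}^T\right)^3}{3!} + \cdots \\
&= \exp\left(\Phi\,\mathcal{R}\mathcal{N}\mathcal{R}^T\right)
\end{aligned} \tag{8.181}$$

Now, by eqns (8.137, 8.171),

$$\mathcal{R}\mathcal{N}\mathcal{R}^T = \sum_{k=1}^{3} n_k\,\mathcal{R}\mathcal{J}^{(k)}\mathcal{R}^T = \sum_{k=1}^{3} n_k \sum_{l=1}^{3} \mathcal{J}^{(l)} R_{lk} = \sum_{l=1}^{3} n_l^{(R)}\,\mathcal{J}^{(l)} \tag{8.182}$$

where we have defined $n_l^{(R)} = \sum_{k=1}^{3} R_{lk}\,n_k$ which, by eqn (8.30), is equivalent to
$\hat{\mathbf{n}}^{(R)} = \mathcal{R}\hat{\mathbf{n}}$. Thus, again using eqn (8.137), eqn (8.181) becomes

$$\mathcal{R}[\Phi\hat{\mathbf{n}}]^{(R)} = \exp\left(\Phi \sum_{l=1}^{3} n_l^{(R)}\,\mathcal{J}^{(l)}\right) = \mathcal{R}[\Phi\hat{\mathbf{n}}^{(R)}] \tag{8.183}$$

as was to be proved. A rotated fixed-axis rotation is indeed a rotation about a rotated
fixed axis! □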
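Theorem 8.23.1 lends itself to a direct numerical check: conjugate a fixed-axis rotation by a second rotation and compare with the fixed-axis rotation about the rotated axis. The sketch below uses assumed values for both rotations and builds each matrix from the closed form of eqn (8.142); the names are chosen for this example only.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k]*v[k] for k in range(3)) for i in range(3)]

def rotation_matrix(phi, n):
    """Closed form of eqn (8.142) for unit vector n."""
    N = [[0.0, -n[2], n[1]], [n[2], 0.0, -n[0]], [-n[1], n[0], 0.0]]
    c, s = math.cos(phi), math.sin(phi)
    return [[(c if i == j else 0.0) + s*N[i][j] + (1.0 - c)*n[i]*n[j]
             for j in range(3)] for i in range(3)]

phi, n = 0.5, [0.6, 0.0, 0.8]                      # the fixed-axis rotation being rotated
outer = rotation_matrix(1.2, [0.0, 0.0, 1.0])      # the "other" rotation, an assumed example

lhs = matmul(matmul(outer, rotation_matrix(phi, n)), transpose(outer))   # left side of eqn (8.178)
n_rotated = matvec(outer, n)                                             # rotated axis
rhs = rotation_matrix(phi, n_rotated)                                    # right side of eqn (8.178)

theorem_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print(n_rotated, theorem_error)     # error is zero up to rounding
```

The angle is untouched by the conjugation and only the axis moves, which is the content of eqn (8.178).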

8.24 Parameterization of Rotation Operators 

A general rotation $\mathcal{R}$ would appear at first sight to require nine parameters, the nine
matrix elements $R_{ij}$, to define it completely. But these nine matrix elements are not
independent, being constrained by the six independent conditions coming from the
orthogonality condition eqn (8.22). A general rotation can be completely defined by
the values of only three independent parameters.

One obvious parameterization of $\mathcal{R}$ would make use of the Euler Theorem of
Section 8.20. As we saw there, any general $\mathcal{R}$ determines the unique angle $\Phi$ and
axis $\hat{\mathbf{n}}$ of an equivalent fixed-axis rotation $\mathcal{R} = \mathcal{R}[\Phi\hat{\mathbf{n}}]$. Since it is a unit vector, the
fixed-axis $\hat{\mathbf{n}}$ can be parameterized by two numbers, its components in spherical-polar
form, as in

$$n_1 = \sin\theta_n \cos\phi_n \qquad n_2 = \sin\theta_n \sin\phi_n \qquad n_3 = \cos\theta_n \tag{8.184}$$

where $\theta_n$ is the angle between $\hat{\mathbf{n}}$ and the $\hat{\mathbf{e}}_3$ axis, and $\phi_n$ is the azimuthal angle. Thus
a general rotation $\mathcal{R}$ can be uniquely parameterized by the three numbers $\Phi$, $\theta_n$, $\phi_n$.

8.25 Differentiation of Parameterized Operator

When the rotation is varying with time, then the three parameters introduced in
Section 8.24 will become time dependent also, $\Phi(t)$, $\theta_n(t)$, $\phi_n(t)$, which we may write
as

$$\mathcal{R}(t) = \mathcal{R}[\Phi(t)\,\hat{\mathbf{n}}(t)] \tag{8.185}$$

At time $t_1$, the rotation $\mathcal{R}(t_1)$ would be associated with the axis $\hat{\mathbf{n}}(t_1)$ and angle $\Phi(t_1)$ of
the fixed-axis rotation that would carry the rigid body from some initial orientation
to the rotated position at time $t_1$. At a later time $t_2$, the rotation $\mathcal{R}(t_2)$ would be
associated with a different axis $\hat{\mathbf{n}}(t_2)$ and angle $\Phi(t_2)$ of a different fixed-axis rotation.
Each rotation $\mathcal{R}[\Phi(t)\,\hat{\mathbf{n}}(t)]$ is about a fixed axis, but the required fixed axis is varying
with time!

The angular velocity $\boldsymbol{\omega}(t)$ of such a time-varying rotation operator can be written
in terms of the time derivatives of $\Phi(t)$ and $\hat{\mathbf{n}}(t)$.

Theorem 8.25.1: Angular Velocity of Parameterized Rotation 

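If we use the parameterization of eqn (8.185) to write a time-varying rotated vector as

$$\mathbf{V}^{(R)}(t) = \mathcal{R}[\Phi(t)\,\hat{\mathbf{n}}(t)]\,\mathbf{V} \tag{8.186}$$

then the time derivative can be written as in eqn (8.76),

$$\frac{d\mathbf{V}^{(R)}(t)}{dt} = \boldsymbol{\omega}(t) \times \mathbf{V}^{(R)}(t) \tag{8.187}$$

where the angular velocity vector $\boldsymbol{\omega}$ can be expressed in terms of the parameters $\Phi$, $\hat{\mathbf{n}}$ and
their derivatives $\dot{\Phi} = d\Phi/dt$ and $d\hat{\mathbf{n}}/dt$ as

$$\boldsymbol{\omega}(t) = \dot{\Phi}\,\hat{\mathbf{n}} + \sin\Phi\,\frac{d\hat{\mathbf{n}}}{dt} + (1 - \cos\Phi)\,\hat{\mathbf{n}} \times \frac{d\hat{\mathbf{n}}}{dt} \tag{8.188}$$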



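Proof: To establish eqn (8.188), we begin by writing eqn (8.186) in the form

$$\mathbf{V}^{(R)}(t) = \mathcal{R}[\Phi(t)\,\hat{\mathbf{n}}(t)]\,\mathbf{V} = \left(\mathcal{U} + \sin\Phi\,\mathcal{N} + (1 - \cos\Phi)\,\mathcal{N}^2\right)\mathbf{V} \tag{8.189}$$

where the expansion of eqn (8.141) was used, with the first of eqn (8.146) used to
substitute $\mathcal{M} = \mathcal{U} + \mathcal{N}^2$. Taking the time derivative of eqn (8.189) gives

$$\frac{d\mathbf{V}^{(R)}(t)}{dt} = \left\{\dot{\Phi}\left(\cos\Phi\,\mathcal{N} + \sin\Phi\,\mathcal{N}^2\right) + \sin\Phi\,\frac{d\mathcal{N}}{dt} + (1 - \cos\Phi)\left(\frac{d\mathcal{N}}{dt}\,\mathcal{N} + \mathcal{N}\,\frac{d\mathcal{N}}{dt}\right)\right\}\mathbf{V} \tag{8.190}$$

From eqn (8.153), the inverse of eqn (8.189) is

$$\mathbf{V} = \mathcal{R}[-\Phi(t)\,\hat{\mathbf{n}}(t)]\,\mathbf{V}^{(R)}(t) = \left(\mathcal{U} - \sin\Phi\,\mathcal{N} + (1 - \cos\Phi)\,\mathcal{N}^2\right)\mathbf{V}^{(R)}(t) \tag{8.191}$$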



Since n is a unit vector with n • n = 1 it follows that n • (dh/dt) = 0. Since, for any 
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arbitrary vector A, 



AfA — hxA and so — A= — xA (8.192) 

dt dt 

expanding the cross products using the rule of triple cross products gives 

Af^^-Af A = 0 and hence Af^-Af — 0 (8.193) 

dt dt 

Substituting eqn (8.191) into eqn (8.190), and using eqn (8.193) as well as the iden- 
tities in eqn (8.146) to simplify, gives 



d\ (R \t) 

dt 



■ , r d.V / dJsf dAf 

4>A( + sm4> h (1 — cos€>) I A f 

dt V dt dt 



A^jv w (0 (8.194) 



Since, for any arbitrary vector A, 



r dAf dAf 



dt 



dt 



AT^- - ^AT A = h x — x A 



dn\ 



dt J 



(8.195) 



eqn (8.194) is equivalent to 

d\ (R Ht) 



dt 



— ( 4>n + sin <t> — + (1 — cos 4>) n x — ) x y iR \t) (8.196) 

d t d t 



which completes the derivation of eqn (8.188). □ 

An important consequence of eqn (8.188) is that if we happen to have a very simple time-dependent rotation, one with a time-varying $\Phi(t)$ but a fixed axis $\hat{\mathbf{n}}$ which does not vary with time, then

$$ \frac{d\hat{\mathbf{n}}}{dt} = 0 \quad\text{and so}\quad \boldsymbol{\omega}(t) = \dot\Phi\,\hat{\mathbf{n}} \tag{8.197} $$
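As a numerical illustration of eqn (8.188), which is not part of the derivation itself, the following Python sketch compares the right side of eqn (8.188) with the angular velocity read off from the anti-symmetric matrix $(d\mathsf{R}/dt)\,\mathsf{R}^T$. The sample functions `angle` and `axis`, the step size, and all function names are assumptions chosen only for illustration; the matrix in `rot` is the standard fixed-axis rotation matrix, taken here to agree with eqn (8.142).

```python
import math

def rot(phi, n):
    """Matrix of the fixed-axis rotation R[phi n] for a unit vector n."""
    c, s = math.cos(phi), math.sin(phi)
    x, y, z = n
    return [[c + (1 - c) * x * x, (1 - c) * x * y - s * z, (1 - c) * x * z + s * y],
            [(1 - c) * y * x + s * z, c + (1 - c) * y * y, (1 - c) * y * z - s * x],
            [(1 - c) * z * x - s * y, (1 - c) * z * y + s * x, c + (1 - c) * z * z]]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def angle(t):
    """Illustrative Phi(t); any smooth function would do."""
    return 0.3 + 0.7 * t

def axis(t):
    """Illustrative unit axis n(t), built as in eqn (8.184)."""
    th, ph = 0.4 + 0.5 * t, 0.2 + 1.1 * t
    return [math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th)]

def omega_from_formula(t, h=1e-5):
    """Right side of eqn (8.188), with derivatives taken by central differences."""
    phi, n = angle(t), axis(t)
    phi_dot = (angle(t + h) - angle(t - h)) / (2 * h)
    n_dot = [(p - m) / (2 * h) for p, m in zip(axis(t + h), axis(t - h))]
    n_x_ndot = cross(n, n_dot)
    return [phi_dot * n[i] + math.sin(phi) * n_dot[i] + (1 - math.cos(phi)) * n_x_ndot[i]
            for i in range(3)]

def omega_from_matrix(t, h=1e-5):
    """omega read off from W = (dR/dt) R^T, using W A = omega x A."""
    R = rot(angle(t), axis(t))
    Rp, Rm = rot(angle(t + h), axis(t + h)), rot(angle(t - h), axis(t - h))
    dR = [[(Rp[i][j] - Rm[i][j]) / (2 * h) for j in range(3)] for i in range(3)]
    W = [[sum(dR[i][k] * R[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
    return [W[2][1], W[0][2], W[1][0]]   # omega_1 = W_32, omega_2 = W_13, omega_3 = W_21
```

Evaluating `omega_from_formula(t)` and `omega_from_matrix(t)` at the same `t` should give the same three components up to finite-difference error.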



8.26 Euler Angles 

For many problems, particularly in rigid body dynamics, the parameterization of a rotation by $\hat{\mathbf{n}}$ and $\Phi$ as in Section 8.24 is not the most convenient one. An alternate parameterization uses the three Euler angles $\alpha$, $\beta$, $\gamma$ defined by

$$ \mathcal{R}[\alpha, \beta, \gamma] = \mathcal{R}[\alpha\hat{\mathbf{e}}_3]\,\mathcal{R}[\beta\hat{\mathbf{e}}_2]\,\mathcal{R}[\gamma\hat{\mathbf{e}}_3] \tag{8.198} $$

The definition in eqn (8.198) consists of three simple rotations about fixed coordinate axes: First by $\gamma$ about the $\hat{\mathbf{e}}_3$ axis, then by $\beta$ about the $\hat{\mathbf{e}}_2$ axis, then by $\alpha$ about the $\hat{\mathbf{e}}_3$ axis again. Since these are rotations by finite angles, their order is quite important, as we will see.

We prove the somewhat surprising fact that the product of these three rotations is capable of reproducing any rotation whatsoever.

Fig. 8.7. Rotation of $\mathbf{V}$ into $\mathbf{V}^{(R)}$ by the Euler angles. First, a rotation by $\gamma$ about the $\hat{\mathbf{e}}_3$ axis rotates $\mathbf{V}$ into $\mathbf{V}^{(1)}$. Then a rotation by $\beta$ about the $\hat{\mathbf{e}}_2$ axis rotates $\mathbf{V}^{(1)}$ into $\mathbf{V}^{(2)}$. Finally a rotation by $\alpha$ about the $\hat{\mathbf{e}}_3$ axis rotates $\mathbf{V}^{(2)}$ into the final vector $\mathbf{V}^{(R)}$.

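Theorem 8.26.1: Adequacy of Euler Angles

For any proper rotation $\mathcal{R}$, there are three angles $\alpha$, $\beta$, $\gamma$ in the ranges $-\pi < \alpha \le \pi$, $0 \le \beta \le \pi$, $-\pi < \gamma \le \pi$ such that

$$ \mathcal{R}[\alpha, \beta, \gamma] = \mathcal{R} \tag{8.199} $$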



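Proof: We proved in Theorem 8.20.1 that for every proper rotation $\mathcal{R}$, there are a unique axis $\hat{\mathbf{n}}$ and angle $0 \le \Phi \le \pi$ such that $\mathcal{R} = \mathcal{R}[\Phi\hat{\mathbf{n}}]$. So here we only need to prove that, given any fixed-axis rotation, there are three angles $\alpha$, $\beta$, $\gamma$ such that $\mathcal{R}[\alpha, \beta, \gamma] = \mathcal{R}[\Phi\hat{\mathbf{n}}]$. We begin by writing the matrices of each factor of eqn (8.198),

$$ \mathsf{R}[\gamma\hat{\mathbf{e}}_3] = \begin{pmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \mathsf{R}[\beta\hat{\mathbf{e}}_2] = \begin{pmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{pmatrix} $$

$$ \mathsf{R}[\alpha\hat{\mathbf{e}}_3] = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{8.200} $$

Multiplication of these three matrices then gives the matrix of $\mathcal{R}[\alpha, \beta, \gamma]$ as

$$ \mathsf{R}[\alpha, \beta, \gamma] = \mathsf{R}[\alpha\hat{\mathbf{e}}_3]\,\mathsf{R}[\beta\hat{\mathbf{e}}_2]\,\mathsf{R}[\gamma\hat{\mathbf{e}}_3] = \begin{pmatrix} \cos\alpha\cos\beta\cos\gamma - \sin\alpha\sin\gamma & -\cos\alpha\cos\beta\sin\gamma - \sin\alpha\cos\gamma & \cos\alpha\sin\beta \\ \sin\alpha\cos\beta\cos\gamma + \cos\alpha\sin\gamma & -\sin\alpha\cos\beta\sin\gamma + \cos\alpha\cos\gamma & \sin\alpha\sin\beta \\ -\sin\beta\cos\gamma & \sin\beta\sin\gamma & \cos\beta \end{pmatrix} \tag{8.201} $$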

We compare this matrix to eqn (8.142). Comparing the 33 elements of the two matrices gives

$$ \cos\beta = n_3^2 + \left(1 - n_3^2\right)\cos\Phi \tag{8.202} $$

The components of unit vector $\hat{\mathbf{n}}$ obey $n_1^2 + n_2^2 + n_3^2 = 1$. It follows that $(1 - n_3^2) \ge 0$ and hence that, as $\Phi$ varies, the right side of eqn (8.202) has a maximum value of $+1$ and a minimum value of $(-1 + 2n_3^2) \ge -1$. It follows that eqn (8.202) defines a unique $\beta$ in the range $0 \le \beta \le \pi$.

Some special cases must now be considered. First, when $n_1 = n_2 = 0$ and hence $n_3^2 = 1$, the rotation is purely about the $\hat{\mathbf{e}}_3$ axis. In this case, eqn (8.202) requires that $\beta = 0$. The other two angles are not separately determined, but may have any values such that $(\alpha + \gamma) = \Phi$.

Second, if $\Phi = 0$ the rotation is the trivial unity rotation, regardless of the value of $\hat{\mathbf{n}}$. Then eqn (8.202) requires that $\beta = 0$. The angles $\alpha$ and $\gamma$ are again not separately determined but may have any values such that $(\alpha + \gamma) = 0$.

Having treated the cases $\Phi = 0$ and $n_3^2 = 1$ separately, we will henceforward assume that $\Phi > 0$ and $n_3^2 < 1$. With these assumptions, we cannot have $\beta = 0$, for that value would reduce eqn (8.202) to $(1 - n_3^2)(1 - \cos\Phi) = 0$, which would be impossible. The case $\beta = \pi$ is possible, however. Again using eqn (8.202), it can arise only when $n_3 = 0$ and $\Phi = \pi$. As can be seen by writing out $\mathsf{R}[\alpha, \pi, \gamma] = \mathsf{R}[\pi\hat{\mathbf{n}}]$ with $n_3 = 0$ assumed, the $\alpha$ and $\gamma$ can then have any values such that $(\gamma - \alpha) = \theta$ where $\theta$ is some unique angle in the range $-\pi < \theta \le \pi$ defined by the pair of equations $\sin\theta = 2n_1 n_2$ and $\cos\theta = (n_2^2 - n_1^2)$.

The indeterminacy of $\alpha$ and $\gamma$ for certain special values of $\Phi$ and $\hat{\mathbf{n}}$ is similar to the situation in spherical polar coordinates, where the azimuthal angle $\phi$ is undetermined when $\theta = 0$. Here, as in the spherical polar case, if $\Phi$ and $\hat{\mathbf{n}}$ are continuously differentiable functions of some parameter, the values of $\alpha$ and $\gamma$ at the indeterminate points can be determined from the condition that $\alpha$, $\beta$, $\gamma$ also vary continuously.

We now find unique values of $\alpha$ and $\gamma$ when $\Phi > 0$, $n_3^2 < 1$ and $\sin\beta > 0$. With $\beta$ known from eqn (8.202), compare the 31 and 32 entries of the two sides of the matrix equation $\mathsf{R}[\alpha, \beta, \gamma] = \mathsf{R}[\Phi\hat{\mathbf{n}}]$ to obtain

$$ \cos\gamma = \frac{n_2 \sin\Phi - n_3 n_1 (1 - \cos\Phi)}{\sin\beta} \tag{8.203} $$

$$ \sin\gamma = \frac{n_1 \sin\Phi + n_3 n_2 (1 - \cos\Phi)}{\sin\beta} \tag{8.204} $$

Similarly, comparison of the 13 and 23 entries gives

$$ \cos\alpha = \frac{n_2 \sin\Phi + n_1 n_3 (1 - \cos\Phi)}{\sin\beta} \tag{8.205} $$

$$ \sin\alpha = \frac{-n_1 \sin\Phi + n_2 n_3 (1 - \cos\Phi)}{\sin\beta} \tag{8.206} $$

The sum of the squares of the right sides of eqn (8.203) and eqn (8.204) equals one. Thus a unique angle $\gamma$ in the range $-\pi < \gamma \le \pi$ is determined. Similarly, eqns (8.205, 8.206) determine a unique $\alpha$ in the range $-\pi < \alpha \le \pi$.

The angles $\alpha$, $\beta$, $\gamma$ are now determined and five of the matrix elements matched. It remains to prove that the 11, 12, 21, and 22 elements of the two matrices are identical with these same choices of $\alpha$, $\beta$, $\gamma$. This algebraic exercise is done in Exercise 8.12. Thus $\mathcal{R} = \mathcal{R}[\Phi\hat{\mathbf{n}}] = \mathcal{R}[\alpha, \beta, \gamma]$, as was to be proved. □

The trace of $\mathcal{R}[\alpha, \beta, \gamma]$ is the sum of the diagonal terms of eqn (8.201). It can be simplified to

$$ \mathrm{Tr}\,\mathsf{R}[\alpha, \beta, \gamma] = \cos\beta + (1 + \cos\beta)\cos(\alpha + \gamma) \tag{8.207} $$
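The construction in the proof of Theorem 8.26.1 can be tried numerically. The Python sketch below, offered only as an illustration, computes $\beta$ from eqn (8.202) and then $\gamma$ and $\alpha$ from eqns (8.203) through (8.206) for the generic case $\Phi > 0$, $n_3^2 < 1$, $\sin\beta > 0$. The function names are hypothetical, and `rot` assumes the standard fixed-axis rotation matrix agrees with eqn (8.142).

```python
import math

def rot(phi, n):
    """Matrix of the fixed-axis rotation R[phi n] for a unit vector n."""
    c, s = math.cos(phi), math.sin(phi)
    x, y, z = n
    return [[c + (1 - c) * x * x, (1 - c) * x * y - s * z, (1 - c) * x * z + s * y],
            [(1 - c) * y * x + s * z, c + (1 - c) * y * y, (1 - c) * y * z - s * x],
            [(1 - c) * z * x - s * y, (1 - c) * z * y + s * x, c + (1 - c) * z * z]]

def euler_matrix(a, b, g):
    """Matrix R[alpha, beta, gamma] of eqn (8.201)."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cg, sg = math.cos(g), math.sin(g)
    return [[ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
            [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
            [-sb * cg, sb * sg, cb]]

def euler_from_axis_angle(phi, n):
    """Generic case only: phi > 0, n3^2 < 1 and sin(beta) > 0."""
    n1, n2, n3 = n
    c, s = math.cos(phi), math.sin(phi)
    cos_beta = n3 * n3 + (1 - n3 * n3) * c                  # eqn (8.202)
    beta = math.acos(max(-1.0, min(1.0, cos_beta)))
    sb = math.sin(beta)
    if sb < 1e-12:
        raise ValueError("special case: alpha and gamma not separately determined")
    cos_g = (n2 * s - n3 * n1 * (1 - c)) / sb               # eqn (8.203)
    sin_g = (n1 * s + n3 * n2 * (1 - c)) / sb               # eqn (8.204)
    cos_a = (n2 * s + n1 * n3 * (1 - c)) / sb               # eqn (8.205)
    sin_a = (-n1 * s + n2 * n3 * (1 - c)) / sb              # eqn (8.206)
    # atan2 returns the unique angle in (-pi, pi] with the given sine and cosine
    return math.atan2(sin_a, cos_a), beta, math.atan2(sin_g, cos_g)
```

Feeding the returned angles to `euler_matrix` should reproduce all nine elements of `rot(phi, n)`, including the four elements whose agreement is left to Exercise 8.12.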

The definition eqn (8.198) rotates first by angle $\gamma$ about $\hat{\mathbf{e}}_3$, then by angle $\beta$ about $\hat{\mathbf{e}}_2$, then by $\alpha$ about $\hat{\mathbf{e}}_3$ again. Repeated use of eqn (8.178) can be used to derive the following remarkable result.

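Theorem 8.26.2: Euler Angles in Reverse Order

The rotation $\mathcal{R}[\alpha, \beta, \gamma]$ defined in eqn (8.198) can also be produced by three rotations that use the angles $\alpha$, $\beta$, $\gamma$ in reverse order, provided that each successive axis of rotation is changed to reflect the effect of rotations already performed,

$$ \mathcal{R}[\alpha, \beta, \gamma] = \mathcal{R}[\alpha\hat{\mathbf{e}}_3]\,\mathcal{R}[\beta\hat{\mathbf{e}}_2]\,\mathcal{R}[\gamma\hat{\mathbf{e}}_3] = \mathcal{R}[\gamma\hat{\mathbf{e}}_3^{(R)}]\,\mathcal{R}[\beta\hat{\mathbf{y}}]\,\mathcal{R}[\alpha\hat{\mathbf{e}}_3] \tag{8.208} $$

where

$$ \hat{\mathbf{e}}_3^{(R)} = \mathcal{R}[\beta\hat{\mathbf{y}}]\,\hat{\mathbf{e}}_3 = \mathcal{R}[\alpha, \beta, \gamma]\,\hat{\mathbf{e}}_3 \quad\text{and}\quad \hat{\mathbf{y}} = \mathcal{R}[\alpha\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_2 \tag{8.209} $$

Proof: The first equality in eqn (8.208) simply repeats the definition eqn (8.198). The proof of the second equality is left as an exercise. □

Note that the equivalence of the two forms in the first of eqn (8.209) follows from the fact that $\hat{\mathbf{e}}_3 = \mathcal{R}[\alpha\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_3$ and $\hat{\mathbf{e}}_3^{(R)} = \mathcal{R}[\gamma\hat{\mathbf{e}}_3^{(R)}]\,\hat{\mathbf{e}}_3^{(R)}$. A unit vector is unchanged by a rotation of which it is the axis.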
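Theorem 8.26.2 is easy to test numerically before attempting its proof. The following Python sketch, with hypothetical function names and arbitrary sample angles, builds both sides of eqn (8.208) from fixed-axis rotation matrices. It assumes that `rot` implements the standard fixed-axis rotation matrix, taken to agree with eqn (8.142).

```python
import math

def rot(phi, n):
    """Matrix of the fixed-axis rotation R[phi n] for a unit vector n."""
    c, s = math.cos(phi), math.sin(phi)
    x, y, z = n
    return [[c + (1 - c) * x * x, (1 - c) * x * y - s * z, (1 - c) * x * z + s * y],
            [(1 - c) * y * x + s * z, c + (1 - c) * y * y, (1 - c) * y * z - s * x],
            [(1 - c) * z * x - s * y, (1 - c) * z * y + s * x, c + (1 - c) * z * z]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

E2, E3 = [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

def standard_order(a, b, g):
    """R[alpha e3] R[beta e2] R[gamma e3], the definition in eqn (8.198)."""
    return matmul(matmul(rot(a, E3), rot(b, E2)), rot(g, E3))

def reverse_order(a, b, g):
    """R[gamma e3^(R)] R[beta y] R[alpha e3], the right side of eqn (8.208)."""
    y_hat = matvec(rot(a, E3), E2)                  # second of eqn (8.209)
    e3_rot = matvec(standard_order(a, b, g), E3)    # first of eqn (8.209)
    return matmul(matmul(rot(g, e3_rot), rot(b, y_hat)), rot(a, E3))
```

For any choice of angles, `standard_order(a, b, g)` and `reverse_order(a, b, g)` should agree element by element to rounding error.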

8.27 Fixed-Axis Rotation from Euler Angles 

In Section 8.26 we began with a fixed-axis rotation $\mathcal{R}[\Phi\hat{\mathbf{n}}]$ and derived the three Euler angles $\alpha$, $\beta$, and $\gamma$. The inverse problem is also of interest.

We are given the three Euler angles $\alpha$, $\beta$, and $\gamma$ and wish to find the equivalent finite rotation with

$$ \mathcal{R}[\Phi\hat{\mathbf{n}}] = \mathcal{R}[\alpha, \beta, \gamma] \tag{8.210} $$

The angle $\Phi$ is found by solving for $0 \le \Phi \le \pi$ in the expression

$$ 2\cos\Phi + 1 = \cos\beta + (1 + \cos\beta)\cos(\alpha + \gamma) \tag{8.211} $$

that is found by equating the traces in eqns (8.149, 8.207).

The components of the axis come from a straightforward application of the eigenvector equation, eqn (8.163), using the matrix $\mathsf{R}[\alpha, \beta, \gamma]$ from eqn (8.201). The normalized eigenvector corresponding to $\lambda_3 = 1$ is $\hat{\mathbf{V}}^{(3)}$, which is the rotation axis. When $\beta = 0$ and $(\alpha + \gamma) = 0$, the rotation is the trivial identity transformation. In that case, the axis $\hat{\mathbf{n}}$ is undetermined since any vector is an eigenvector of $\mathcal{U}$ with eigenvalue one. When $\beta = 0$ and $(\alpha + \gamma) \ne 0$, the components of this axis vector are $(0, 0, \pm 1)$ with the sign depending on the quadrant of $(\alpha + \gamma)$. When $\beta \ne 0$ but $(\alpha + \gamma) = 0$, the components of the axis vector are $(-\sin\alpha, \cos\alpha, 0)$. When $\beta \ne 0$ and $(\alpha + \gamma) \ne 0$, the components of the (not yet normalized) axis vector are

$$ V_1^{(3)} = (1 - \cos\beta)(\cos\alpha - \cos\gamma) \tag{8.212} $$

$$ V_2^{(3)} = (1 - \cos\beta)(\sin\alpha + \sin\gamma) \tag{8.213} $$

$$ V_3^{(3)} = \sin\beta\,(1 - \cos(\alpha + \gamma)) \tag{8.214} $$

The normalized rotation axis is then $\hat{\mathbf{n}} = \mathbf{V}^{(3)}/V^{(3)}$ where $\mathbf{V}^{(3)} = \sum_{i=1}^{3} V_i^{(3)}\hat{\mathbf{e}}_i$ and $V^{(3)}$ is its magnitude. Just as in the Euler Theorem proof, the two sides of eqn (8.210) must be applied to some vector not parallel to $\hat{\mathbf{n}}$ and the results compared. If they fail to match, $\hat{\mathbf{n}}$ must be replaced by $-\hat{\mathbf{n}}$.
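The recipe of this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: it handles only the generic case $\beta \ne 0$, $(\alpha + \gamma) \ne 0$, the function names are hypothetical, and the sign test compares all nine matrix elements (that is, it applies both sides of eqn (8.210) to all three basis vectors) instead of a single trial vector.

```python
import math

def rot(phi, n):
    """Matrix of the fixed-axis rotation R[phi n] for a unit vector n."""
    c, s = math.cos(phi), math.sin(phi)
    x, y, z = n
    return [[c + (1 - c) * x * x, (1 - c) * x * y - s * z, (1 - c) * x * z + s * y],
            [(1 - c) * y * x + s * z, c + (1 - c) * y * y, (1 - c) * y * z - s * x],
            [(1 - c) * z * x - s * y, (1 - c) * z * y + s * x, c + (1 - c) * z * z]]

def euler_matrix(a, b, g):
    """Matrix R[alpha, beta, gamma] of eqn (8.201)."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cg, sg = math.cos(g), math.sin(g)
    return [[ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
            [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
            [-sb * cg, sb * sg, cb]]

def axis_angle_from_euler(a, b, g):
    """Return (Phi, n) for the generic case beta != 0 and (alpha + gamma) != 0."""
    cb = math.cos(b)
    cos_phi = (cb + (1 + cb) * math.cos(a + g) - 1) / 2      # eqn (8.211)
    phi = math.acos(max(-1.0, min(1.0, cos_phi)))
    v = [(1 - cb) * (math.cos(a) - math.cos(g)),             # eqn (8.212)
         (1 - cb) * (math.sin(a) + math.sin(g)),             # eqn (8.213)
         math.sin(b) * (1 - math.cos(a + g))]                # eqn (8.214)
    mag = math.sqrt(sum(x * x for x in v))
    if mag < 1e-12:
        raise ValueError("special case: axis not given by eqns (8.212)-(8.214)")
    n = [x / mag for x in v]
    target = euler_matrix(a, b, g)

    def mismatch(axis):
        trial = rot(phi, axis)
        return max(abs(trial[i][j] - target[i][j]) for i in range(3) for j in range(3))

    # Keep whichever of n and -n reproduces the Euler-angle matrix
    flipped = [-x for x in n]
    return (phi, n) if mismatch(n) <= mismatch(flipped) else (phi, flipped)
```

The returned pair can be checked by confirming that `rot(phi, n)` reproduces `euler_matrix(a, b, g)`.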



8.28 Time Derivative of a Product 

It is useful to have general formulas for the time derivative of a rotation operator that is the product of time-dependent rotation operators

$$ \mathcal{R}(t) = \mathcal{R}_a(t)\,\mathcal{R}_b(t) \tag{8.215} $$

Theorem 8.28.1: Angular Velocity of a Product

The angular velocity associated with the rotation $\mathcal{R}(t)$ defined in eqn (8.215) is

$$ \boldsymbol{\omega}(t) = \boldsymbol{\omega}_a(t) + \mathcal{R}_a(t)\,\boldsymbol{\omega}_b(t) \tag{8.216} $$

where $\boldsymbol{\omega}_a(t)$, $\boldsymbol{\omega}_b(t)$ are the angular velocities associated with $\mathcal{R}_a(t)$, $\mathcal{R}_b(t)$, respectively.
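Proof: Using the product rule,

$$ \frac{d\mathcal{R}(t)}{dt} = \frac{d\mathcal{R}_a(t)}{dt}\,\mathcal{R}_b(t) + \mathcal{R}_a(t)\,\frac{d\mathcal{R}_b(t)}{dt} \tag{8.217} $$

and hence, by eqn (8.70) of Section 8.10, the anti-symmetric operator associated with rotation $\mathcal{R}(t)$ is

$$ \mathcal{W}(t) = \frac{d\mathcal{R}(t)}{dt}\,\mathcal{R}^T(t) = \frac{d\mathcal{R}_a(t)}{dt}\,\mathcal{R}_b(t)\left(\mathcal{R}_a(t)\mathcal{R}_b(t)\right)^T + \mathcal{R}_a(t)\,\frac{d\mathcal{R}_b(t)}{dt}\left(\mathcal{R}_a(t)\mathcal{R}_b(t)\right)^T \tag{8.218} $$

Since $\left(\mathcal{R}_a(t)\mathcal{R}_b(t)\right)^T = \mathcal{R}_b^T(t)\,\mathcal{R}_a^T(t)$ and each operator is orthogonal, this expression reduces to

$$ \mathcal{W}(t) = \frac{d\mathcal{R}_a(t)}{dt}\,\mathcal{R}_a^T(t) + \mathcal{R}_a(t)\,\frac{d\mathcal{R}_b(t)}{dt}\,\mathcal{R}_b^T(t)\,\mathcal{R}_a^T(t) = \mathcal{W}_a(t) + \mathcal{R}_a(t)\,\mathcal{W}_b(t)\,\mathcal{R}_a^T(t) \tag{8.219} $$

where $\mathcal{W}_a(t)$, $\mathcal{W}_b(t)$ are the anti-symmetric operators associated with rotations $\mathcal{R}_a(t)$, $\mathcal{R}_b(t)$ respectively. Applying this to a general vector $\mathbf{A}$ and using eqn (8.75),

$$ \boldsymbol{\omega}(t) \times \mathbf{A} = \boldsymbol{\omega}_a(t) \times \mathbf{A} + \mathcal{R}_a(t)\left\{ \boldsymbol{\omega}_b(t) \times \left(\mathcal{R}_a^T(t)\,\mathbf{A}\right) \right\} \tag{8.220} $$

Then eqn (8.39) of Section 8.6 and the orthogonality of the operators gives

$$ \boldsymbol{\omega}(t) \times \mathbf{A} = \boldsymbol{\omega}_a(t) \times \mathbf{A} + \left\{ \left(\mathcal{R}_a(t)\,\boldsymbol{\omega}_b(t)\right) \times \left(\mathcal{R}_a(t)\mathcal{R}_a^T(t)\,\mathbf{A}\right) \right\} = \boldsymbol{\omega}_a(t) \times \mathbf{A} + \left(\mathcal{R}_a(t)\,\boldsymbol{\omega}_b(t)\right) \times \mathbf{A} = \left(\boldsymbol{\omega}_a(t) + \mathcal{R}_a(t)\,\boldsymbol{\omega}_b(t)\right) \times \mathbf{A} \tag{8.221} $$

Since vector $\mathbf{A}$ is arbitrary, this implies eqn (8.216), as was to be proved. □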

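The above theorem can be applied repeatedly to obtain the derivative of a product of any finite number of factors. Thus the angular velocity associated with

$$ \mathcal{R}(t) = \mathcal{R}_a(t)\,\mathcal{R}_b(t)\,\mathcal{R}_c(t) \cdots \mathcal{R}_y(t)\,\mathcal{R}_z(t) \tag{8.222} $$

is

$$ \boldsymbol{\omega}(t) = \boldsymbol{\omega}_a(t) + \mathcal{R}_a(t)\,\boldsymbol{\omega}_b(t) + \left(\mathcal{R}_a(t)\mathcal{R}_b(t)\right)\boldsymbol{\omega}_c(t) + \cdots + \left(\mathcal{R}_a(t)\mathcal{R}_b(t)\mathcal{R}_c(t) \cdots \mathcal{R}_y(t)\right)\boldsymbol{\omega}_z(t) \tag{8.223} $$

in which each angular velocity is modified by all rotations that are applied after it.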
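A two-factor example makes eqn (8.216) concrete. In the Python sketch below, which is only an illustration, $\mathcal{R}_a = \mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]$ and $\mathcal{R}_b = \mathcal{R}[\beta(t)\hat{\mathbf{e}}_2]$, so that eqn (8.197) gives $\boldsymbol{\omega}_a = \dot\alpha\,\hat{\mathbf{e}}_3$ and $\boldsymbol{\omega}_b = \dot\beta\,\hat{\mathbf{e}}_2$. The sample functions `alpha` and `beta` and all function names are assumptions made for the example; the prediction of eqn (8.216) is compared with the angular velocity extracted from a finite-difference estimate of $(d\mathsf{R}/dt)\,\mathsf{R}^T$.

```python
import math

def rot_e3(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_e2(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def alpha(t):
    return 0.5 * t + 0.2 * t * t     # illustrative; alpha-dot = 0.5 + 0.4 t

def beta(t):
    return 0.3 + 0.9 * t             # illustrative; beta-dot = 0.9

def product(t):
    """R(t) = R_a(t) R_b(t) as in eqn (8.215)."""
    return matmul(rot_e3(alpha(t)), rot_e2(beta(t)))

def omega_predicted(t):
    """eqn (8.216): omega = omega_a + R_a omega_b."""
    omega_a = [0.0, 0.0, 0.5 + 0.4 * t]
    omega_b = [0.0, 0.9, 0.0]
    rotated_b = matvec(rot_e3(alpha(t)), omega_b)
    return [omega_a[i] + rotated_b[i] for i in range(3)]

def omega_numeric(t, h=1e-5):
    """omega read off from W = (dR/dt) R^T, using W A = omega x A."""
    R, Rp, Rm = product(t), product(t + h), product(t - h)
    dR = [[(Rp[i][j] - Rm[i][j]) / (2 * h) for j in range(3)] for i in range(3)]
    W = [[sum(dR[i][k] * R[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
    return [W[2][1], W[0][2], W[1][0]]
```

The two results should agree to within the finite-difference error, and the same pattern extends factor by factor to eqn (8.223).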

8.29 Angular Velocity from Euler Angles 

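Time-dependent rotations can be parameterized by time-varying Euler angles. The angular velocity vector of the rotation can then be obtained as a function of the Euler angles and their first time derivatives.

Theorem 8.29.1: Angular Velocity from Euler Angles

Let a time-varying rotation be defined by

$$ \mathcal{R}(t) = \mathcal{R}[\alpha(t), \beta(t), \gamma(t)] = \mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\mathcal{R}[\beta(t)\hat{\mathbf{e}}_2]\,\mathcal{R}[\gamma(t)\hat{\mathbf{e}}_3] \tag{8.224} $$

where $\alpha(t)$, $\beta(t)$, and $\gamma(t)$ give the three Euler angles as continuous, differentiable functions of the time. Then the vector $\boldsymbol{\omega}(t)$ associated with the time-dependent rotation in eqn (8.224) is

$$ \boldsymbol{\omega}(t) = \dot\alpha\,\hat{\mathbf{e}}_3 + \dot\beta\,\hat{\mathbf{y}}(t) + \dot\gamma\,\hat{\mathbf{e}}_3^{(R)}(t) \tag{8.225} $$

where the dots represent time derivatives and $\hat{\mathbf{e}}_3^{(R)}(t) = \mathcal{R}[\alpha(t), \beta(t), \gamma(t)]\,\hat{\mathbf{e}}_3$ and $\hat{\mathbf{y}}(t) = \mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_2$ are the same vectors found in eqn (8.209) above.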

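Proof: Each of the factors on the right side of eqn (8.224) is a rotation about a fixed coordinate axis. Hence, by eqn (8.197), the angular velocity vectors associated with rotations $\mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]$, $\mathcal{R}[\beta(t)\hat{\mathbf{e}}_2]$, $\mathcal{R}[\gamma(t)\hat{\mathbf{e}}_3]$ are $\boldsymbol{\omega}_\alpha(t) = \dot\alpha\,\hat{\mathbf{e}}_3$, $\boldsymbol{\omega}_\beta(t) = \dot\beta\,\hat{\mathbf{e}}_2$, and $\boldsymbol{\omega}_\gamma(t) = \dot\gamma\,\hat{\mathbf{e}}_3$, respectively. Then eqn (8.223) gives

$$ \boldsymbol{\omega}(t) = \dot\alpha\,\hat{\mathbf{e}}_3 + \dot\beta\,\mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_2 + \dot\gamma\,\mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\mathcal{R}[\beta(t)\hat{\mathbf{e}}_2]\,\hat{\mathbf{e}}_3 \tag{8.226} $$

Since rotation about a fixed axis does not change that axis vector, $\hat{\mathbf{e}}_3 = \mathcal{R}[\gamma(t)\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_3$ and hence

$$ \mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\mathcal{R}[\beta(t)\hat{\mathbf{e}}_2]\,\hat{\mathbf{e}}_3 = \mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\mathcal{R}[\beta(t)\hat{\mathbf{e}}_2]\,\mathcal{R}[\gamma(t)\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_3 = \mathcal{R}[\alpha(t), \beta(t), \gamma(t)]\,\hat{\mathbf{e}}_3 = \hat{\mathbf{e}}_3^{(R)}(t) \tag{8.227} $$

Thus, using the definitions above, eqn (8.226) becomes identical to eqn (8.225), as was to be proved. □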

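The expression for $\boldsymbol{\omega}$ in eqn (8.225) is not yet in a useful form. It needs to be expressed in terms of a single set of basis vectors. Denoting the components in the $\hat{\mathbf{e}}_k$ system by $\omega_k(t)$ gives

$$ \boldsymbol{\omega}(t) = \omega_1(t)\,\hat{\mathbf{e}}_1 + \omega_2(t)\,\hat{\mathbf{e}}_2 + \omega_3(t)\,\hat{\mathbf{e}}_3 \tag{8.228} $$

where, using eqn (8.225), $\omega_k(t) = \hat{\mathbf{e}}_k \cdot \boldsymbol{\omega}(t)$ may be written as

$$ \omega_k(t) = \dot\alpha\,\hat{\mathbf{e}}_k \cdot \hat{\mathbf{e}}_3 + \dot\beta\,\hat{\mathbf{e}}_k \cdot \left(\mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_2\right) + \dot\gamma\,\hat{\mathbf{e}}_k \cdot \hat{\mathbf{e}}_3^{(R)}(t) = \dot\alpha\,\delta_{k3} + \dot\beta\,R_{k2}[\alpha(t)\hat{\mathbf{e}}_3] + \dot\gamma\,R_{k3}[\alpha(t), \beta(t), \gamma(t)] \tag{8.229} $$

Writing the components out explicitly using eqns (8.200, 8.201) gives

$$ \omega_1(t) = -\dot\beta\sin\alpha + \dot\gamma\cos\alpha\sin\beta \tag{8.230} $$

$$ \omega_2(t) = \dot\beta\cos\alpha + \dot\gamma\sin\alpha\sin\beta \tag{8.231} $$

$$ \omega_3(t) = \dot\alpha + \dot\gamma\cos\beta \tag{8.232} $$

where the Euler angles and their derivatives are all functions of time.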
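The inertial components in eqns (8.230) through (8.232) can be checked against a direct numerical derivative of the Euler-angle matrix. The Python sketch below is illustrative only: the linear angle histories in `angles`, the constant `RATES`, and the function names are assumptions introduced for the example.

```python
import math

def euler_matrix(a, b, g):
    """Matrix R[alpha, beta, gamma] of eqn (8.201)."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cg, sg = math.cos(g), math.sin(g)
    return [[ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
            [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
            [-sb * cg, sb * sg, cb]]

RATES = (0.5, 0.3, 0.8)   # constant alpha-dot, beta-dot, gamma-dot (illustrative)

def angles(t):
    """Illustrative Euler angles varying linearly in time."""
    return (0.2 + RATES[0] * t, 0.6 + RATES[1] * t, -0.1 + RATES[2] * t)

def omega_inertial(t):
    """Components of omega in the fixed e_k system, eqns (8.230)-(8.232)."""
    a, b, g = angles(t)
    ad, bd, gd = RATES
    return [-bd * math.sin(a) + gd * math.cos(a) * math.sin(b),
            bd * math.cos(a) + gd * math.sin(a) * math.sin(b),
            ad + gd * math.cos(b)]

def omega_numeric(t, h=1e-5):
    """omega read off from W = (dR/dt) R^T, using W A = omega x A."""
    R = euler_matrix(*angles(t))
    Rp, Rm = euler_matrix(*angles(t + h)), euler_matrix(*angles(t - h))
    dR = [[(Rp[i][j] - Rm[i][j]) / (2 * h) for j in range(3)] for i in range(3)]
    W = [[sum(dR[i][k] * R[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
    return [W[2][1], W[0][2], W[1][0]]
```

Agreement between `omega_inertial(t)` and `omega_numeric(t)` at several times is a quick guard against sign errors when these formulas are transcribed.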

8.30 Active and Passive Rotations 



We now return to the general treatment of the rotation operator. The rotation operator $\mathcal{R}$ can be used either actively or passively. The active use, which is the only one we have discussed to this point, transforms each vector $\mathbf{V}$ into a rotated vector $\mathbf{V}^{(R)} = \mathcal{R}\mathbf{V}$. Although we have not emphasized the point, it is implicit that the coordinate system unit vectors $\hat{\mathbf{e}}_i$ are not changed by this active rotation.

The passive use of $\mathcal{R}$ makes the opposite choice. The unit vectors $\hat{\mathbf{e}}_i$ are rotated to form a new coordinate system, which we will denote $\hat{\mathbf{e}}'_i$ and define, for $i = 1, 2, 3$, as

$$ \hat{\mathbf{e}}'_i = \hat{\mathbf{e}}_i^{(R)} = \mathcal{R}\,\hat{\mathbf{e}}_i \tag{8.233} $$

The vector $\mathbf{V}$, however, is not changed by passive rotations. In a sense, active rotations rotate the world while passive ones rotate the observer. Note that the same rotation operator $\mathcal{R}$ is used in both cases, but used differently.

Fig. 8.8. On the left is an active rotation. The vector $\mathbf{V}$ is rotated into the new vector $\mathbf{V}^{(R)}$ but the basis vectors $\hat{\mathbf{e}}_i$ do not change. On the right is a passive rotation. The basis vectors $\hat{\mathbf{e}}_i$ are rotated into new basis vectors $\hat{\mathbf{e}}'_i$ but the vector $\mathbf{V}$ does not change.

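We now consider passive rotations. We will refer to the original unit vectors $\hat{\mathbf{e}}_i$ as the old system or the original system, and often denote it by the letter o. The rotated unit vectors $\hat{\mathbf{e}}'_i$ will be called the new system or the rotated system, and will often be denoted by the letter o'.

We assume here and in the following sections that the $\hat{\mathbf{e}}_i$ are a right-handed orthonormal system and $\mathcal{R}$ is a proper rotation operator. It follows from Definition 8.6.2 that the new system of unit vectors is orthonormal and right-handed,

$$ \hat{\mathbf{e}}'_i \cdot \hat{\mathbf{e}}'_j = \hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_j = \delta_{ij} \quad\text{and}\quad \hat{\mathbf{e}}'_1 \times \hat{\mathbf{e}}'_2 = \hat{\mathbf{e}}'_3 \tag{8.234} $$

The new unit vectors can be expanded in terms of the old ones by eqn (8.33) with the identification $\hat{\mathbf{e}}'_i = \hat{\mathbf{e}}_i^{(R)}$,

$$ \hat{\mathbf{e}}'_i = \sum_{j=1}^{3} \hat{\mathbf{e}}_j \left(\hat{\mathbf{e}}_j \cdot \hat{\mathbf{e}}'_i\right) = \sum_{j=1}^{3} \hat{\mathbf{e}}_j \left(\hat{\mathbf{e}}_j \cdot \mathcal{R}\hat{\mathbf{e}}_i\right) = \sum_{j=1}^{3} \hat{\mathbf{e}}_j R_{ji} = \sum_{j=1}^{3} R^T_{ij}\,\hat{\mathbf{e}}_j \tag{8.235} $$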

8.31 Passive Transformation of Vector Components 

An important thing to notice is that, although the vectors $\mathbf{V}$ do not change in passive rotations, their components do change. An unchanged vector $\mathbf{V}$ can be expanded in either system

$$ \sum_{i=1}^{3} V_i\,\hat{\mathbf{e}}_i = \mathbf{V} = \sum_{i=1}^{3} V'_i\,\hat{\mathbf{e}}'_i \tag{8.236} $$

where the components in the two systems are

$$ V_i = \hat{\mathbf{e}}_i \cdot \mathbf{V} \quad\text{and}\quad V'_i = \hat{\mathbf{e}}'_i \cdot \mathbf{V} \tag{8.237} $$

The new components $V'_i = \mathbf{V} \cdot \hat{\mathbf{e}}'_i$ are different from the old ones $V_i = \mathbf{V} \cdot \hat{\mathbf{e}}_i$ even though $\mathbf{V}$ is the same in both cases, because the unit vectors are different. We can denote the two alternate expansions of the vector $\mathbf{V}$ into components by the notation

$$ \mathbf{V} : (V_1, V_2, V_3)_o \tag{8.238} $$

$$ \mathbf{V} : (V'_1, V'_2, V'_3)_{o'} \tag{8.239} $$

We avoid using the equal sign here. A vector $\mathbf{V}$ is not equal to its components, rather it is represented by its components in a particular reference system. As we see, the components in the o' system will be different from those in the o system, even though the vector $\mathbf{V}$ is the same in both cases.

It follows from eqn (8.235) that the components of $\mathbf{V}$ in the two systems are related by

$$ V'_i = \hat{\mathbf{e}}'_i \cdot \mathbf{V} = \left( \sum_{j=1}^{3} R_{ji}\,\hat{\mathbf{e}}_j \right) \cdot \mathbf{V} = \sum_{j=1}^{3} R_{ji}\,V_j \tag{8.240} $$

This relation can also be written in matrix form. If we denote by $[V]$ the column vector of components in the o system, and by $[V']$ the column vector of components in the o' system, then eqn (8.240) can be written as

$$ [V'] = \mathsf{R}^T [V] \tag{8.241} $$
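A small numerical case may help fix the distinction between the unchanged vector and its changed components. In the Python sketch below, given only as an illustration with hypothetical function names, the passive rotation is by $\pi/2$ about $\hat{\mathbf{e}}_3$, so the new x-axis lies along the old y-axis.

```python
import math

def rot_e3(alpha):
    """Matrix R[alpha e3], the third matrix of eqn (8.200)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def passive_components(R, V):
    """Primed components V'_i = sum_j R_ji V_j, i.e. [V'] = R^T [V] (eqn 8.241)."""
    return [sum(R[j][i] * V[j] for j in range(3)) for i in range(3)]

def rebuild(R, Vp):
    """Components in the o system of sum_i V'_i e'_i, using (e'_i)_j = R_ji from eqn (8.235)."""
    return [sum(Vp[i] * R[j][i] for i in range(3)) for j in range(3)]

V = [1.0, 2.0, 3.0]
R = rot_e3(math.pi / 2)
V_new = passive_components(R, V)    # approximately [2, -1, 3]
V_back = rebuild(R, V_new)          # approximately [1, 2, 3], the same vector V
```

The component along the new x-axis equals the old y-component, as expected, and re-expanding with the rotated basis vectors recovers the original vector, in line with eqn (8.236).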



8.32 Passive Transformation of Matrix Elements 

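Just as a vector $\mathbf{V}$ has different components in the o and o' systems, an operator $\mathcal{B}$ will also be represented by different matrix elements (which we might consider as the "components" of the operator) in the two systems. Since we are considering passive rotations now, the operator itself is not changed by the rotation, but its matrix elements are changed. In the two systems, we have

$$ B_{ij} = \hat{\mathbf{e}}_i \cdot \mathcal{B}\hat{\mathbf{e}}_j \quad\text{and}\quad B'_{ij} = \hat{\mathbf{e}}'_i \cdot \mathcal{B}\hat{\mathbf{e}}'_j \tag{8.242} $$

which are related by

$$ B'_{ij} = \hat{\mathbf{e}}'_i \cdot \mathcal{B}\hat{\mathbf{e}}'_j = \left( \sum_{k=1}^{3} R_{ki}\,\hat{\mathbf{e}}_k \right) \cdot \mathcal{B} \left( \sum_{l=1}^{3} R_{lj}\,\hat{\mathbf{e}}_l \right) = \sum_{k=1}^{3} \sum_{l=1}^{3} R^T_{ik}\,B_{kl}\,R_{lj} \tag{8.243} $$

which can be written as the matrix equation

$$ \mathsf{B}' = \mathsf{R}^T\,\mathsf{B}\,\mathsf{R} \tag{8.244} $$

We say that the operator $\mathcal{B}$ is represented by the matrix $\mathsf{B}$ in the o system and by the matrix $\mathsf{B}'$ in the o' system.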
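One consistency check on eqn (8.244) is that transforming the components of the vector $\mathcal{B}\mathbf{V}$ by eqn (8.241) must give the same result as letting $\mathsf{B}'$ act on $[V']$. The Python sketch below illustrates this with an arbitrary matrix and vector; the particular numbers and the helper names are assumptions made for the example.

```python
import math

def rot_e3(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_e2(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

R = matmul(rot_e3(0.7), rot_e2(0.4))                        # an illustrative proper rotation
B = [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 4.0]]     # arbitrary operator matrix in o
V = [1.0, -2.0, 0.5]                                        # arbitrary components in o

Rt = transpose(R)
B_new = matmul(matmul(Rt, B), R)        # eqn (8.244)
lhs = matvec(Rt, matvec(B, V))          # components of B V in the o' system, eqn (8.241)
rhs = matvec(B_new, matvec(Rt, V))      # B' acting on [V']
trace_old = sum(B[i][i] for i in range(3))
trace_new = sum(B_new[i][i] for i in range(3))
```

Here `lhs` and `rhs` should coincide, and the trace is unchanged by the passive transformation.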

It follows from eqns (8.235, 8.243) that the dyadic associated with operator $\mathcal{B}$ can be written out in either system, as

$$ \sum_{i=1}^{3} \sum_{j=1}^{3} \hat{\mathbf{e}}_i\,B_{ij}\,\hat{\mathbf{e}}_j = \mathbb{B} = \sum_{i=1}^{3} \sum_{j=1}^{3} \hat{\mathbf{e}}'_i\,B'_{ij}\,\hat{\mathbf{e}}'_j \tag{8.245} $$

Of particular interest is the unit operator $\mathcal{U}$. It follows from the orthogonality of $\mathcal{R}$ and the transformation rule eqn (8.240) that its matrix elements are the same in any system,

$$ U_{ij} = \delta_{ij} = U'_{ij} \tag{8.246} $$

Thus the resolution-of-unity dyadic, the dyadic associated with $\mathcal{U}$, has exactly the same algebraic form in the two systems,

$$ \hat{\mathbf{e}}_1\hat{\mathbf{e}}_1 + \hat{\mathbf{e}}_2\hat{\mathbf{e}}_2 + \hat{\mathbf{e}}_3\hat{\mathbf{e}}_3 = \mathbb{U} = \hat{\mathbf{e}}'_1\hat{\mathbf{e}}'_1 + \hat{\mathbf{e}}'_2\hat{\mathbf{e}}'_2 + \hat{\mathbf{e}}'_3\hat{\mathbf{e}}'_3 \tag{8.247} $$

Multiplying a vector by the expansion of $\mathbb{U}$ in the o (o') system will expand that vector in the o (o') system.



8.33 The Body Derivative 

Let us now consider the case in which the o system is a fixed inertial system, but the rotation operator and the rotated system o' are both time varying. The basis vectors of the o' system will thus be functions of time. For $i = 1, 2, 3$,

$$ \hat{\mathbf{e}}'_i(t) = \hat{\mathbf{e}}_i^{(R)}(t) = \mathcal{R}(t)\,\hat{\mathbf{e}}_i \tag{8.248} $$

From the consideration of time-dependent rotations in Section 8.10, we know that the time derivatives of the o' system basis vectors are

$$ \frac{d\hat{\mathbf{e}}'_i(t)}{dt} = \boldsymbol{\omega}(t) \times \hat{\mathbf{e}}'_i(t) \tag{8.249} $$

where $\boldsymbol{\omega}(t)$ is the (generally time-dependent) angular velocity vector of the time varying rotation $\mathcal{R}(t)$.

Now consider the task of calculating the time-derivative of some vector $\mathbf{V}$. If we expand this vector in the o system, then the time derivative will be

$$ \frac{d\mathbf{V}}{dt} = \frac{d}{dt} \sum_{i=1}^{3} V_i\,\hat{\mathbf{e}}_i = \sum_{i=1}^{3} \frac{dV_i}{dt}\,\hat{\mathbf{e}}_i \tag{8.250} $$



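However, if we expand $\mathbf{V}$ in the o' system, the same derivative will have a more complicated form, due to the time variation of the unit vectors. It is

$$ \frac{d\mathbf{V}}{dt} = \frac{d}{dt} \sum_{i=1}^{3} V'_i\,\hat{\mathbf{e}}'_i = \sum_{i=1}^{3} \left( \frac{dV'_i}{dt}\,\hat{\mathbf{e}}'_i + V'_i\,\frac{d\hat{\mathbf{e}}'_i}{dt} \right) = \sum_{i=1}^{3} \left( \frac{dV'_i}{dt}\,\hat{\mathbf{e}}'_i + V'_i\,\boldsymbol{\omega}(t) \times \hat{\mathbf{e}}'_i \right) \tag{8.251} $$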

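Collecting terms and using the linearity of cross products to factor $\boldsymbol{\omega}(t)$ out of the second term on the right gives

$$ \frac{d\mathbf{V}}{dt} = \left( \frac{d\mathbf{V}}{dt} \right)_b + \boldsymbol{\omega}(t) \times \mathbf{V} \tag{8.252} $$

where the first term on the right is the so-called body derivative, a vector defined as

$$ \left( \frac{d\mathbf{V}}{dt} \right)_b = \sum_{k=1}^{3} \frac{dV'_k}{dt}\,\hat{\mathbf{e}}'_k \tag{8.253} $$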



To understand what the body derivative is, imagine an observer rotating with the o' system who is unaware that it and he are rotating. (We on the surface of the earth are good examples.) If he is asked to calculate the time derivative of a vector, he will first express that vector in his o' reference system, and then calculate eqn (8.253), the body derivative. He thinks he is using eqn (8.250), but that is his error since his o' reference system is not, in fact, inertial. After he calculates the body derivative, we can correct his error by adding the term $\boldsymbol{\omega}(t) \times \mathbf{V}$.

So the recipe for getting the body derivative is: (1) Express the vector in the o' system, and then (2) take the time derivative as if the $\hat{\mathbf{e}}'_k$ basis vectors were constants. Note that, although this body derivative is calculated in a special way, nonetheless it is just an ordinary vector that can be expanded, if needed, in any coordinate system.
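The two-step recipe can be carried out numerically. In the following Python sketch, offered as an illustration only, the o' system rotates about $\hat{\mathbf{e}}_3$ through an angle `phi(t)`, so that eqn (8.197) gives $\boldsymbol{\omega} = \dot\Phi\,\hat{\mathbf{e}}_3$. The sample vector `V(t)`, the angle history, and the function names are assumptions chosen for the example. The body derivative is formed from the primed components and then corrected by $\boldsymbol{\omega} \times \mathbf{V}$ as in eqn (8.252).

```python
import math

def rot_e3(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def phi(t):
    return 0.4 * t + 0.1 * t * t          # illustrative rotation angle

def phi_dot(t):
    return 0.4 + 0.2 * t

def V(t):
    """Illustrative vector, given by its components in the inertial o system."""
    return [math.cos(t), t * t, 1.0 + math.sin(2.0 * t)]

def primed(t):
    """Step (1): components in the rotating o' system, V'_i = sum_j R_ji V_j."""
    R, v = rot_e3(phi(t)), V(t)
    return [sum(R[j][i] * v[j] for j in range(3)) for i in range(3)]

def body_derivative(t, h=1e-5):
    """Step (2): differentiate the V'_i as if e'_i were constant, then expand in o."""
    dVp = [(p - m) / (2 * h) for p, m in zip(primed(t + h), primed(t - h))]
    R = rot_e3(phi(t))
    return [sum(R[j][i] * dVp[i] for i in range(3)) for j in range(3)]

def inertial_derivative(t, h=1e-5):
    """dV/dt computed directly in the inertial system, eqn (8.250)."""
    return [(p - m) / (2 * h) for p, m in zip(V(t + h), V(t - h))]

def corrected(t):
    """Right side of eqn (8.252): body derivative plus omega x V."""
    w_cross_v = cross([0.0, 0.0, phi_dot(t)], V(t))
    b = body_derivative(t)
    return [b[i] + w_cross_v[i] for i in range(3)]
```

The output of `corrected(t)` should match `inertial_derivative(t)`, while `body_derivative(t)` alone generally does not.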

8.34 Passive Rotations and Rigid Bodies 

We can identify the moving coordinate system $\hat{\mathbf{e}}'_i$ of passive rotations introduced in Section 8.30 with the similarly denoted coordinate system embedded in the moving rigid body in Section 8.9. The position and orientation of the rigid body at time $t$ can be thought of as the position and orientation of this $\hat{\mathbf{e}}'_i$ system of coordinates, whose origin is at the center of mass of the body and whose orientation is given by

$$ \hat{\mathbf{e}}'_i(t) = \mathcal{R}(t)\,\hat{\mathbf{e}}_i \tag{8.254} $$

derived from eqn (8.233). In this system of coordinates, the vectors $\boldsymbol{\rho}_n$ can be expressed as

$$ \boldsymbol{\rho}_n(t) = \rho'_{n1}(t)\,\hat{\mathbf{e}}'_1(t) + \rho'_{n2}(t)\,\hat{\mathbf{e}}'_2(t) + \rho'_{n3}(t)\,\hat{\mathbf{e}}'_3(t) \tag{8.255} $$

where the components were shown in Section 8.8 to obey

$$ \rho'_{ni}(t) = \hat{\mathbf{e}}'_i(t) \cdot \boldsymbol{\rho}_n(t) = \hat{\mathbf{e}}'_i(0) \cdot \boldsymbol{\rho}_n(0) = \rho'_{ni}(0) \tag{8.256} $$

and hence do not vary with time.

Thus the time derivative of $\boldsymbol{\rho}_n(t)$, when expanded in terms of the body derivative and its correction, becomes

$$ \frac{d\boldsymbol{\rho}_n(t)}{dt} = \left( \frac{d\boldsymbol{\rho}_n(t)}{dt} \right)_b + \boldsymbol{\omega}(t) \times \boldsymbol{\rho}_n(t) \tag{8.257} $$

where

$$ \left( \frac{d\boldsymbol{\rho}_n(t)}{dt} \right)_b = \sum_{i=1}^{3} \frac{d\rho'_{ni}(t)}{dt}\,\hat{\mathbf{e}}'_i = 0 \tag{8.258} $$

since eqn (8.256) implies that $d\rho'_{ni}(t)/dt = 0$. Thus

$$ \dot{\boldsymbol{\rho}}_n(t) = \frac{d\boldsymbol{\rho}_n(t)}{dt} = \boldsymbol{\omega}(t) \times \boldsymbol{\rho}_n(t) \tag{8.259} $$

which reproduces eqn (8.93).

The coordinate system embedded in the rigid body, with unit vectors given by eqn (8.254) and origin at the center of mass of the body, will be used frequently in subsequent chapters. The constancy of the components $\rho'_{ni}(t) = \rho'_{ni}(0)$ in that system will lead to many simplifications.

8.35 Passive Use of Euler Angles

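The time-dependent passive rotation in eqn (8.254) can be parameterized using time-dependent Euler angles, as developed in Section 8.26. Rather than simply using the standard definition in eqn (8.198), however, it is clearer to introduce the alternate form of the Euler angle operators in eqn (8.208) to write

$$ \hat{\mathbf{e}}'_i(t) = \mathcal{R}[\alpha(t), \beta(t), \gamma(t)]\,\hat{\mathbf{e}}_i = \mathcal{R}[\gamma(t)\hat{\mathbf{e}}'_3(t)]\,\mathcal{R}[\beta(t)\hat{\mathbf{y}}(t)]\,\mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_i \tag{8.260} $$

where the definition $\hat{\mathbf{e}}_3^{(R)}(t) = \hat{\mathbf{e}}'_3(t)$ has been used, and where

$$ \hat{\mathbf{e}}'_3(t) = \mathcal{R}[\beta(t)\hat{\mathbf{y}}(t)]\,\hat{\mathbf{e}}_3 = \mathcal{R}[\alpha(t), \beta(t), \gamma(t)]\,\hat{\mathbf{e}}_3 \quad\text{and}\quad \hat{\mathbf{y}}(t) = \mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_2 \tag{8.261} $$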




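Fig. 8.9. Steps in the passive use of Euler angles. First a rotation by $\alpha$ about the $\hat{\mathbf{e}}_3$ axis leads to $\hat{\mathbf{e}}''_i$. Then (center figure) a rotation by $\beta$ about the $\hat{\mathbf{e}}''_2$ axis leads to $\hat{\mathbf{e}}'''_i$. Finally, a rotation by $\gamma$ about the $\hat{\mathbf{e}}'''_3$ axis leads to the final orientation $\hat{\mathbf{e}}'_i$.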



Then the progression from $\hat{\mathbf{e}}_i$ to $\hat{\mathbf{e}}'_i(t)$ can be decomposed into three easily visualized steps. First, the original triad is rotated by angle $\alpha(t)$ about the $\hat{\mathbf{e}}_3$ axis by the operator $\mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]$ to produce a triad that will be denoted $\hat{\mathbf{e}}''_i(t)$. Thus

$$ \hat{\mathbf{e}}''_i(t) = \mathcal{R}[\alpha(t)\hat{\mathbf{e}}_3]\,\hat{\mathbf{e}}_i \tag{8.262} $$

for $i = 1, 2, 3$. The unit vector denoted $\hat{\mathbf{y}}(t)$ in eqn (8.261) is seen to be the same as the vector $\hat{\mathbf{e}}''_2(t)$ produced by this first rotation. It is the rotated y-axis. Note also that rotation about the z-axis does not change the z-axis, so $\hat{\mathbf{e}}''_3 = \hat{\mathbf{e}}_3$.

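The triad $\hat{\mathbf{e}}''_i(t)$ is then rotated by angle $\beta(t)$ about its own $\hat{\mathbf{e}}''_2(t)$-axis by the second rotation operator to act, $\mathcal{R}[\beta(t)\hat{\mathbf{y}}(t)] = \mathcal{R}[\beta(t)\hat{\mathbf{e}}''_2(t)]$. Call the resulting triad $\hat{\mathbf{e}}'''_i(t)$. Thus, for $i = 1, 2, 3$,

$$ \hat{\mathbf{e}}'''_i(t) = \mathcal{R}[\beta(t)\hat{\mathbf{e}}''_2(t)]\,\hat{\mathbf{e}}''_i(t) \tag{8.263} $$

The new z-axis after this second rotation is $\hat{\mathbf{e}}'''_3(t)$, which is in fact identical to the final z-axis $\hat{\mathbf{e}}'_3(t)$. Also, since the y-axis is unchanged by a rotation about the y-axis, $\hat{\mathbf{e}}'''_2(t) = \hat{\mathbf{e}}''_2(t) = \hat{\mathbf{y}}(t)$.

In the final step, the triad $\hat{\mathbf{e}}'''_i(t)$ is rotated by angle $\gamma(t)$ about its own $\hat{\mathbf{e}}'''_3(t)$-axis by the operator $\mathcal{R}[\gamma(t)\hat{\mathbf{e}}'_3(t)] = \mathcal{R}[\gamma(t)\hat{\mathbf{e}}'''_3(t)]$ to produce the final triad $\hat{\mathbf{e}}'_i(t)$. For $i = 1, 2, 3$,

$$ \hat{\mathbf{e}}'_i(t) = \mathcal{R}[\gamma(t)\hat{\mathbf{e}}'''_3(t)]\,\hat{\mathbf{e}}'''_i(t) \tag{8.264} $$

Note that rotation about the z-axis does not change the z-axis, and so $\hat{\mathbf{e}}'_3(t) = \hat{\mathbf{e}}'''_3(t)$ as was mentioned previously.

Thus a three-step process applied to the triad $\hat{\mathbf{e}}_i$, consisting of rotation about the original z-axis $\hat{\mathbf{e}}_3$ by $\alpha$, rotation about the new y-axis $\hat{\mathbf{y}}(t)$ by $\beta$, and rotation about the even newer z-axis $\hat{\mathbf{e}}'_3(t)$ by $\gamma$, has led to the final triad $\hat{\mathbf{e}}'_i(t)$.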



Fig. 8.10. Another view of the passive use of Euler angles. Not all unit vectors are shown, and the final rotation by $\gamma$ is not shown. Note that $\hat{\mathbf{e}}'_3$ lies in the $\hat{\mathbf{e}}_3$-$\hat{\mathbf{e}}''_1$ plane and has spherical polar angles $\alpha$, $\beta$ regardless of the value of $\gamma$.

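The final z-axis $\hat{\mathbf{e}}'_3(t)$ will have spherical polar coordinates

$$ 1,\ \theta_{3'},\ \phi_{3'} \quad\text{where}\quad \theta_{3'} = \beta \quad\text{and}\quad \phi_{3'} = \alpha \tag{8.265} $$

Hence, using either the definitions of spherical polar coordinates in Section A.8, or applying the matrix in eqn (8.201) to the column vector $(0, 0, 1)^T$ to obtain the components, the vector $\hat{\mathbf{e}}'_3(t)$ can be expressed in the $\hat{\mathbf{e}}_i$ coordinate system as

$$ \hat{\mathbf{e}}'_3(t) = \sin\beta\cos\alpha\,\hat{\mathbf{e}}_1 + \sin\beta\sin\alpha\,\hat{\mathbf{e}}_2 + \cos\beta\,\hat{\mathbf{e}}_3 \tag{8.266} $$

Since parameterization of the body system by the three Euler angles will be used extensively in the following chapters, it will be useful to express the angular velocity vector in the body system. The general angular velocity is given by eqn (8.225) as

$$ \boldsymbol{\omega}(t) = \dot\alpha\,\hat{\mathbf{e}}_3 + \dot\beta\,\hat{\mathbf{y}} + \dot\gamma\,\hat{\mathbf{e}}_3^{(R)} = \dot\alpha\,\hat{\mathbf{e}}_3 + \dot\beta\,\hat{\mathbf{e}}''_2(t) + \dot\gamma\,\hat{\mathbf{e}}'_3(t) \tag{8.267} $$

where the last expression has been converted to the notation of the present section.

The expansion in the $\hat{\mathbf{e}}'_i(t)$ system is

$$ \boldsymbol{\omega}(t) = \omega'_1(t)\,\hat{\mathbf{e}}'_1(t) + \omega'_2(t)\,\hat{\mathbf{e}}'_2(t) + \omega'_3(t)\,\hat{\mathbf{e}}'_3(t) \tag{8.268} $$

where, for $i = 1, 2, 3$,

$$ \omega'_i(t) = \hat{\mathbf{e}}'_i(t) \cdot \boldsymbol{\omega}(t) = \hat{\mathbf{e}}'_i(t) \cdot \left( \dot\alpha\,\hat{\mathbf{e}}_3 + \dot\beta\,\hat{\mathbf{e}}''_2(t) + \dot\gamma\,\hat{\mathbf{e}}'_3(t) \right) = \dot\alpha\,R_{3i}[\alpha(t), \beta(t), \gamma(t)] + \dot\beta\,R'''_{2i}[\gamma(t)\hat{\mathbf{e}}'''_3] + \dot\gamma\,\delta_{i3} \tag{8.269} $$

Noting that $R'''_{2i}[\gamma(t)\hat{\mathbf{e}}'''_3] = R_{2i}[\gamma(t)\hat{\mathbf{e}}_3]$, the matrices in eqn (8.200) through eqn (8.201) may be used to evaluate the needed matrix elements, giving finally

$$ \omega'_1(t) = -\dot\alpha\sin\beta\cos\gamma + \dot\beta\sin\gamma \tag{8.270} $$

$$ \omega'_2(t) = \dot\alpha\sin\beta\sin\gamma + \dot\beta\cos\gamma \tag{8.271} $$

$$ \omega'_3(t) = \dot\alpha\cos\beta + \dot\gamma \tag{8.272} $$

and the Euler angles and their derivatives are all functions of time. These equations could also be derived, or checked as in Exercise 8.5, by applying eqn (8.240) directly to the inertial components of $\boldsymbol{\omega}(t)$ in eqns (8.230 - 8.232).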
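The consistency between the inertial and body components can be spot-checked numerically before working Exercise 8.5 algebraically. The Python sketch below applies the passive rule $[\omega'] = \mathsf{R}^T[\omega]$ to the inertial components of eqns (8.230) through (8.232) and compares the result with eqns (8.270) through (8.272). The function names and sample values are assumptions used only for illustration.

```python
import math

def euler_matrix(a, b, g):
    """Matrix R[alpha, beta, gamma] of eqn (8.201)."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cg, sg = math.cos(g), math.sin(g)
    return [[ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
            [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
            [-sb * cg, sb * sg, cb]]

def omega_inertial(a, b, g, ad, bd, gd):
    """Inertial components, eqns (8.230)-(8.232); ad, bd, gd are the angle rates."""
    return [-bd * math.sin(a) + gd * math.cos(a) * math.sin(b),
            bd * math.cos(a) + gd * math.sin(a) * math.sin(b),
            ad + gd * math.cos(b)]

def omega_body(a, b, g, ad, bd, gd):
    """Body-system components, eqns (8.270)-(8.272)."""
    return [-ad * math.sin(b) * math.cos(g) + bd * math.sin(g),
            ad * math.sin(b) * math.sin(g) + bd * math.cos(g),
            ad * math.cos(b) + gd]

def passive_components(R, v):
    """[v'] = R^T [v], the passive rule of eqn (8.241)."""
    return [sum(R[j][i] * v[j] for j in range(3)) for i in range(3)]

sample = (0.7, 1.1, -0.4, 0.5, 0.3, 0.8)   # alpha, beta, gamma and their rates
transformed = passive_components(euler_matrix(*sample[:3]), omega_inertial(*sample))
direct = omega_body(*sample)
```

For any sample values, `transformed` and `direct` should agree to rounding error.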

8.36 Exercises 

Exercise 8.1 Consider a rotation $\mathcal{R}[\Phi\hat{\mathbf{n}}]$ with $\Phi = 30°$ and a fixed axis $\hat{\mathbf{n}}$ that lies in the first octant and makes the same angle with each of the coordinate axes.

(a) Find numerical values for all nine components of the matrix $\mathsf{R}[\Phi\hat{\mathbf{n}}]$. [Note: It is much better to write the matrix elements in exact forms like, e.g., $\sqrt{3}/2$, rather than in terms of decimals.]

(b) Verify numerically that your matrix is a proper, orthogonal matrix.

(c) Check numerically that $\mathrm{Tr}\,\mathsf{R}[\Phi\hat{\mathbf{n}}] = 1 + 2\cos\Phi$ for your matrix, as is required by eqn (8.149).

(d) Check numerically that $\mathsf{R}[\Phi\hat{\mathbf{n}}]\,[n] = [n]$ where $[n]$ is the column vector of components of $\hat{\mathbf{n}}$. Why is this equation true?

Exercise 8.2 Consider a plane mirror. Denote the unit vector normal to its surface and pointing out into the room by $\hat{\mathbf{n}}$. Let an operator $\mathcal{M}$ convert a general vector $\mathbf{V}$ in front of the mirror into its reflected image $\mathbf{V}^{(M)} = \mathcal{M}\mathbf{V}$ behind the mirror. The matrix $\mathsf{M}$ of this operator was found in Exercise 7.3.

(a) Consider the operator $\mathcal{R}[\pi\hat{\mathbf{n}}]$ that rotates vectors by 180° about the normal to the mirror. Find a general expression for its matrix elements $R_{ij}[\pi\hat{\mathbf{n}}]$ in terms of the components $n_i$ of vector $\hat{\mathbf{n}}$.

(b) Show that $\mathsf{M} = \mathsf{T}\,\mathsf{R}[\pi\hat{\mathbf{n}}] = \mathsf{R}[\pi\hat{\mathbf{n}}]\,\mathsf{T}$ where $\mathsf{T}$ is the matrix of the total inversion operator $\mathcal{T} = -\mathcal{U}$ discussed in Section 8.6. Mirror reflection is thus equivalent to total inversion followed or preceded by rotation by 180° about the normal to the surface of the mirror.

(c) Use the result (b) to argue that $\mathcal{M}$ must be an improper rotation operator.

Exercise 8.3 Suppose that a rotation operator is defined by the Euler angles

$$ \alpha = 45° \qquad \beta = 30° \qquad \gamma = -45° \tag{8.273} $$

(a) Write the numerical values of all nine matrix elements of the matrix $\mathsf{R}[\alpha, \beta, \gamma]$. [Note: It is much better to use exact forms such as, e.g., $\sqrt{2}/3$, rather than decimals. Please do it that way.]

(b) Use the result of Exercise 7.1 to check that your matrix is orthogonal.

(c) By the Euler Theorem, there must be some fixed-axis rotation such that, for some $\Phi$ in the range $0 \le \Phi \le \pi$ and some axis $\hat{\mathbf{n}}$

$$ \mathsf{R}[\Phi\hat{\mathbf{n}}] = \mathsf{R}[\alpha, \beta, \gamma] \tag{8.274} $$

Find the numerical value of angle $\Phi$ by the condition that the traces of both sides of eqn (8.274) must be the same.

(d) The axis vector $\hat{\mathbf{n}}$ for this rotation has components $(-1, 1, 0)/\sqrt{2}$. Verify that $\mathsf{R}[\alpha, \beta, \gamma]\,[n] = [n]$.

(e) Denoting $\hat{\mathbf{e}}_3^{(R)} = \mathcal{R}[\alpha, \beta, \gamma]\,\hat{\mathbf{e}}_3$, verify that $\hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_3^{(R)} = \eta\,\hat{\mathbf{n}}$ where $\eta$ is a positive number. Why is that so? What would it mean if $\eta$ turned out to be a negative number?

Exercise 8.4 Use the results of Theorem 8.23.1 repeatedly to prove the second equality in eqn (8.208) of Theorem 8.26.2.

Exercise 8.5 Use the passive transformation rule $[\omega'] = \mathsf{R}^T[\alpha, \beta, \gamma]\,[\omega]$ from eqn (8.240), and the inertial system components $\omega_i$ of $\boldsymbol{\omega}$ from eqns (8.230 - 8.232), to obtain the components $\omega'_i$ of $\boldsymbol{\omega}$ in the body system stated in eqns (8.270 - 8.272).

Exercise 8.6 Use eqn (8.252) to show that the angular velocity in eqn (8.188) can also be written as

$$ \boldsymbol{\omega} = \dot\Phi\,\hat{\mathbf{n}} + \sin\Phi \left( \frac{d\hat{\mathbf{n}}}{dt} \right)_b - (1 - \cos\Phi)\,\hat{\mathbf{n}} \times \left( \frac{d\hat{\mathbf{n}}}{dt} \right)_b \tag{8.275} $$

Exercise 8.7 Use eqns (8.104, 8.140) to show that fixed-axis rotations and infinitesimal rotations are related by

$$ \mathcal{R}[d\Phi\,\hat{\mathbf{n}}] = \mathcal{R}_I[d\Phi\,\hat{\mathbf{n}}] + o(d\Phi) \tag{8.276} $$

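Exercise 8.8 This exercise refers to Sections 8.30 through 8.32.

(a) The rotation matrix has been defined throughout the chapter by its expression in the unrotated $\hat{\mathbf{e}}_i$ basis, $R_{ij} = \hat{\mathbf{e}}_i \cdot \mathcal{R}\hat{\mathbf{e}}_j$. Show that its expression in terms of the rotated basis, $R'_{ij} = \hat{\mathbf{e}}'_i \cdot \mathcal{R}\hat{\mathbf{e}}'_j$, obeys $R'_{ij} = R_{ij}$ and hence that $\mathcal{R}$ has the same matrix in either basis.

(b) From eqn (8.70), one obtains the matrix equation $\mathsf{W} = (d\mathsf{R}/dt)\,\mathsf{R}^T$ where the matrices are expressed in the unrotated basis. Show that the matrix of the angular velocity operator $\mathcal{W}$ in the rotated basis is

$$ \mathsf{W}' = \mathsf{R}^T\,\frac{d\mathsf{R}}{dt} \tag{8.277} $$

(c) In eqn (8.78) the matrix elements of the angular velocity operator $\mathcal{W}$ in the unrotated basis are written in terms of the components of the angular velocity vector $\boldsymbol{\omega}$ in that basis as $W_{ij} = \sum_{k=1}^{3} \varepsilon_{ikj}\,\omega_k$ where $\boldsymbol{\omega} = \sum_{k=1}^{3} \omega_k\,\hat{\mathbf{e}}_k$. Show that the matrix elements of $\mathcal{W}$ in the rotated basis have the same relation to the components $\omega'_k$ in that basis

$$ W'_{ij} = \sum_{k=1}^{3} \varepsilon_{ikj}\,\omega'_k \quad\text{where}\quad \boldsymbol{\omega} = \sum_{k=1}^{3} \omega'_k\,\hat{\mathbf{e}}'_k \tag{8.278} $$

[Hint: Use $\hat{\mathbf{e}}'_i \cdot \hat{\mathbf{e}}'_j \times \hat{\mathbf{e}}'_k = \varepsilon_{ijk} = \hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_j \times \hat{\mathbf{e}}_k$ and the transformation rule eqn (8.235).]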

Exercise 8.9 In eqn (7.160) of Exercise 7.8, the matrix representing an operator in the
spherical basis was related to the standard Cartesian operator by the equation

$$\mathsf{F}^{(\mathrm{sp})} = \mathsf{T}\,\mathsf{F}\,\mathsf{T}^{\dagger} \tag{8.279}$$

(a) Apply that transformation to the matrices defined in eqn (8.200) to find $\mathsf{R}^{(\mathrm{sp})}[\gamma\hat{\mathbf{e}}_3]$,
$\mathsf{R}^{(\mathrm{sp})}[\beta\hat{\mathbf{e}}_2]$, and $\mathsf{R}^{(\mathrm{sp})}[\alpha\hat{\mathbf{e}}_3]$.

(b) Check your work by comparing your $\mathsf{R}^{(\mathrm{sp})}[\beta\hat{\mathbf{e}}_2]$ to the matrix with components $d^{1}_{m,m'}(\beta)$
as listed in the page titled “Clebsch-Gordan Coefficients, Spherical Harmonics, and d Functions”
in S. Eidelman et al. (2004) “Review of Particle Physics,” Phys. Lett. B 592, 1. (That
reference uses $\theta$ in place of $\beta$.)

(c) Equation (8.201) defines the matrix $\mathsf{R}[\alpha,\beta,\gamma]$ representing a general rotation parameterized
in terms of Euler angles. Show that the matrix $\mathsf{R}^{(\mathrm{sp})}[\alpha,\beta,\gamma]$ representing this rotation
in the spherical basis can be written as

$$R^{(\mathrm{sp})}_{mm'}[\alpha,\beta,\gamma] = e^{-i\alpha m}\, d^{1}_{mm'}(\beta)\, e^{-i\gamma m'} \tag{8.280}$$

This matrix, often denoted as $D^{1}_{mm'}[\alpha,\beta,\gamma]$, is used to represent rotations of state vectors
of angular momentum $\ell = 1$ in quantum theory. See, for example, Chapter 12 of Shankar
(1994).

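Exercise 8.10 In eqn (8.138) it was shown that a rotation by angle $\Phi$ about a fixed axis $\hat{\mathbf{n}}$ can
be written as

$$\mathcal{R}[\Phi\hat{\mathbf{n}}] = \exp\left(\Phi\,\hat{\mathbf{n}} \cdot \boldsymbol{\mathcal{J}}\right) \tag{8.281}$$

where $\boldsymbol{\mathcal{J}}$ is the vector with operator components defined in eqn (8.135).

(a) Prove that the commutation relations eqn (8.127) imply that, for two unit vectors $\hat{\mathbf{n}}_1$ and
$\hat{\mathbf{n}}_2$,

$$\left[(\hat{\mathbf{n}}_1 \cdot \boldsymbol{\mathcal{J}}),\, (\hat{\mathbf{n}}_2 \cdot \boldsymbol{\mathcal{J}})\right]_c = (\hat{\mathbf{n}}_1 \times \hat{\mathbf{n}}_2) \cdot \boldsymbol{\mathcal{J}} \tag{8.282}$$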

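Exercise 8.11 In Sections 8.14 and 8.15 we demonstrated that infinitesimal rotations commute.
This result can be expressed as

$$\left[\mathcal{R}[\varepsilon\hat{\mathbf{n}}_1],\, \mathcal{R}[\varepsilon\hat{\mathbf{n}}_2]\right]_c = o(\varepsilon) \quad\text{as}\quad \varepsilon \to 0 \tag{8.283}$$

where $\hat{\mathbf{n}}_1$ and $\hat{\mathbf{n}}_2$ are any two unit vectors. [Recall that the symbol $o(\varepsilon)$ means that the quantity
is of smaller order than $\varepsilon$. See the definitions in Section D.11.]

(a) We now want to carry the calculation of the commutator to quadratic order. Use eqn
(8.282) and

$$\mathcal{R}[\varepsilon\hat{\mathbf{n}}_1] = \mathcal{U} + \varepsilon\,(\hat{\mathbf{n}}_1 \cdot \boldsymbol{\mathcal{J}}) + \frac{\varepsilon^2}{2}\,(\hat{\mathbf{n}}_1 \cdot \boldsymbol{\mathcal{J}})^2 + o(\varepsilon^2) \tag{8.284}$$

together with a similar definition for $\hat{\mathbf{n}}_2$, where $\boldsymbol{\mathcal{J}}$ is defined in eqn (8.135), to prove that

$$\left[\mathcal{R}[\varepsilon\hat{\mathbf{n}}_1],\, \mathcal{R}[\varepsilon\hat{\mathbf{n}}_2]\right]_c = \varepsilon^2\,(\hat{\mathbf{n}}_1 \times \hat{\mathbf{n}}_2) \cdot \boldsymbol{\mathcal{J}} + o(\varepsilon^2) \tag{8.285}$$

(b) If we denote by $\Delta\mathbf{V}$ the change in vector $\mathbf{V}$ due to rotation by angle $\varepsilon$ about axis $\hat{\mathbf{n}}$,
demonstrate that

$$\Delta\mathbf{V} = \mathbf{V}^{(R)} - \mathbf{V} = \left(\mathcal{R}[\varepsilon\hat{\mathbf{n}}] - \mathcal{U}\right)\mathbf{V} \tag{8.286}$$

(c) Now denote the cumulative change in $\mathbf{V}$ resulting from successive rotations, first about
axis $\hat{\mathbf{n}}_2$ and then about axis $\hat{\mathbf{n}}_1$, as

$$\Delta\mathbf{V}^{(1,2)} = \left(\mathcal{R}[\varepsilon\hat{\mathbf{n}}_1]\,\mathcal{R}[\varepsilon\hat{\mathbf{n}}_2] - \mathcal{U}\right)\mathbf{V} \tag{8.287}$$

and denote $\Delta\mathbf{V}^{(2,1)}$ as the change produced by the same two rotations but with the order
reversed. Prove that the difference between these two changes is

$$\Delta\mathbf{V}^{(1,2)} - \Delta\mathbf{V}^{(2,1)} = \left[\mathcal{R}[\varepsilon\hat{\mathbf{n}}_1],\, \mathcal{R}[\varepsilon\hat{\mathbf{n}}_2]\right]_c \mathbf{V} = \left(\mathcal{R}[\varepsilon^2(\hat{\mathbf{n}}_1 \times \hat{\mathbf{n}}_2)] - \mathcal{U}\right)\mathbf{V} + o(\varepsilon^2) \tag{8.288}$$

Thus the difference between the changes produced by pairs of rotations in opposite orders
is, to second order in $\varepsilon$, equal to the change produced by a rotation about an axis parallel to
$(\hat{\mathbf{n}}_1 \times \hat{\mathbf{n}}_2)$.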

(d) Suppose that we first rotate successively about the x followed by the y axis. And then 
we start again and rotate successively about the y followed by the x axis. Show that the 
difference between the changes produced by two procedures is, to second order, equal to the 
change produced by a rotation about the z axis. 

Exercise 8.12 Complete the proof of Theorem 8.26.1. 

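Exercise 8.13 Under the conditions $\beta \neq 0$ and $(\alpha + \gamma) \neq 0$ stated in Section 8.27, verify
eqns (8.212 - 8.214).

Exercise 8.14 Find Euler angles $\alpha, \beta, \gamma$ for the following rotations:

(a) Rotation by $\pi/3$ radians about an axis $\hat{\mathbf{n}} = (\hat{\mathbf{e}}_1 + \hat{\mathbf{e}}_2)/\sqrt{2}$.

(b) Rotation by $\pi$ radians about an axis $\hat{\mathbf{n}} = \left((\sqrt{3} - 1)\,\hat{\mathbf{e}}_1 + (\sqrt{3} + 1)\,\hat{\mathbf{e}}_2\right)/\sqrt{8}$.

(c) Rotation by $\pi/3$ radians about an axis $\hat{\mathbf{n}} = (\hat{\mathbf{e}}_1 + \hat{\mathbf{e}}_2 + \hat{\mathbf{e}}_3)/\sqrt{3}$.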

Exercise 8.15 Use the methods of Section 8.18 to derive the matrix $\mathsf{R}[\beta\hat{\mathbf{e}}_2]$ in eqn (8.200).

The successful description of rigid-body motion is one of the triumphs of Newtonian
mechanics. Having learned in the previous chapter how to specify the position and 
orientation of a rigid body, we now study its natural motion under impressed external 
forces and torques. The dynamical theorems of collective motion from Chapter 1 will 
be extended by use of the rotation operators whose properties were developed in 
Chapter 8. 



9.1 Basic Facts of Rigid-Body Motion 

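The center of mass $\mathbf{R}$ of a rigid body obeys the same formulas as those summarized
in Section 1.15 of Chapter 1 for any collection of point masses,

$$\frac{d\mathbf{P}}{dt} = \mathbf{F}^{(\mathrm{ext})} \quad\text{where}\quad \mathbf{P} = M\mathbf{V} \quad\text{and}\quad \mathbf{V} = \frac{d\mathbf{R}}{dt} \tag{9.1}$$

The orbital angular momentum formulas are also the same,

$$\frac{d\mathbf{L}}{dt} = \boldsymbol{\tau}_o^{(\mathrm{ext})} \quad\text{where}\quad \mathbf{L} = \mathbf{R} \times \mathbf{P} \quad\text{and}\quad \boldsymbol{\tau}_o^{(\mathrm{ext})} = \mathbf{R} \times \mathbf{F}^{(\mathrm{ext})} \tag{9.2}$$

The spin angular momentum of a rigid body is the same as that defined in Section
1.11. It is

$$\mathbf{S} = \sum_{n=1}^{N} \boldsymbol{\rho}_n \times m_n \dot{\boldsymbol{\rho}}_n \tag{9.3}$$

and obeys the equation of motion derived in Section 1.13,

$$\frac{d\mathbf{S}}{dt} = \boldsymbol{\tau}_s^{(\mathrm{ext})} \quad\text{where}\quad \boldsymbol{\tau}_s^{(\mathrm{ext})} = \sum_{n=1}^{N} \boldsymbol{\rho}_n \times \mathbf{f}_n^{(\mathrm{ext})} \tag{9.4}$$

and $\boldsymbol{\rho}_n = \mathbf{r}_n - \mathbf{R}$ is the relative position vector defined in eqn (1.33).

The difference between a rigid body and a general collection of point masses is
the relation

$$\dot{\boldsymbol{\rho}}_n = \boldsymbol{\omega} \times \boldsymbol{\rho}_n \tag{9.5}$$

that holds only for rigid bodies. This important formula for the time derivatives of
the relative position vectors was initially stated in Section 8.12 and then re-derived
in Section 8.34 using the concept of body derivative.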

The application of eqn (9.5) allows the formulas for $\mathbf{S}$ and its time derivative to
be expressed in a very useful operator form, in which the properties of the rigid body
itself are contained in an operator $\mathcal{I}$ called the inertia operator or inertia tensor. The
exact form of this operator depends on the details of the experimental situation being
treated. We begin with the case of a freely moving rigid body.

9.2 The Inertia Operator and the Spin 

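Consider a rigid body moving freely in empty space, for example a tumbling asteroid.
Applying eqn (9.5) to the definition of $\mathbf{S}$ in eqn (9.3) and expanding the triple vector
product gives

$$\mathbf{S} = \sum_{n=1}^{N} m_n \boldsymbol{\rho}_n \times (\boldsymbol{\omega} \times \boldsymbol{\rho}_n) = \sum_{n=1}^{N} m_n \left\{ (\boldsymbol{\rho}_n \cdot \boldsymbol{\rho}_n)\,\boldsymbol{\omega} - \boldsymbol{\rho}_n (\boldsymbol{\rho}_n \cdot \boldsymbol{\omega}) \right\} \tag{9.6}$$

Section 8.34 describes the body system of coordinates $\hat{\mathbf{e}}'_i(t)$ that move with the
rigid body. Expressed in terms of components in that system,

$$\mathbf{S} = \sum_{i=1}^{3} S'_i\,\hat{\mathbf{e}}'_i \qquad \boldsymbol{\rho}_n = \sum_{i=1}^{3} \rho'_{ni}\,\hat{\mathbf{e}}'_i \qquad \boldsymbol{\omega} = \sum_{i=1}^{3} \omega'_i\,\hat{\mathbf{e}}'_i \tag{9.7}$$

and eqn (9.6) becomes

$$\begin{aligned} S'_i &= \sum_{n=1}^{N} m_n \left\{ \left(\rho_{n1}^{\prime 2} + \rho_{n2}^{\prime 2} + \rho_{n3}^{\prime 2}\right) \omega'_i - \rho'_{ni} \sum_{j=1}^{3} \rho'_{nj}\,\omega'_j \right\} \\ &= \sum_{j=1}^{3} \sum_{n=1}^{N} m_n \left\{ \left(\rho_{n1}^{\prime 2} + \rho_{n2}^{\prime 2} + \rho_{n3}^{\prime 2}\right) \delta_{ij} - \rho'_{ni}\,\rho'_{nj} \right\} \omega'_j \end{aligned} \tag{9.8}$$

where $\omega'_i = \sum_{j=1}^{3} \delta_{ij}\,\omega'_j$ has been used.

Introducing the definition

$$I^{(\mathrm{cm})\prime}_{ij} = \sum_{n=1}^{N} m_n \left\{ \left(\rho_{n1}^{\prime 2} + \rho_{n2}^{\prime 2} + \rho_{n3}^{\prime 2}\right) \delta_{ij} - \rho'_{ni}\,\rho'_{nj} \right\} \tag{9.9}$$

allows eqn (9.8) to be written as

$$S'_i = \sum_{j=1}^{3} I^{(\mathrm{cm})\prime}_{ij}\,\omega'_j \tag{9.10}$$

The discussion of the equivalence of operators, matrices, and components in Section 7.8
can now be invoked to write

$$\mathbf{S} = \mathcal{I}^{(\mathrm{cm})} \boldsymbol{\omega} \tag{9.11}$$

where $\mathcal{I}^{(\mathrm{cm})}$ is that operator whose matrix elements in the body system are given by
eqn (9.9). This operator will be called the center-of-mass inertia operator. Often it is
also called the center-of-mass inertia tensor.

An important feature of this inertia operator is that its matrix elements in the body
system are not time varying.

Lemma 9.2.1: Constancy of Matrix Elements

The body-system matrix elements

$$I^{(\mathrm{cm})\prime}_{ij} = \hat{\mathbf{e}}'_i \cdot \mathcal{I}^{(\mathrm{cm})}\,\hat{\mathbf{e}}'_j \tag{9.12}$$

of the operator $\mathcal{I}^{(\mathrm{cm})}$ are constants, obeying

$$\frac{dI^{(\mathrm{cm})\prime}_{ij}}{dt} = 0 \tag{9.13}$$

Proof: In Section 8.8, and again in Section 8.34, we saw that the components $\rho'_{ni}$ of
the relative position vectors in the body system are constants, with

$$\rho'_{ni}(t) = \rho'_{ni}(0) \quad\text{and hence}\quad \frac{d\rho'_{ni}}{dt} = 0 \tag{9.14}$$

But the expression for $I^{(\mathrm{cm})\prime}_{ij}$ in eqn (9.9) contains only the components $\rho'_{ni}$, and hence
$I^{(\mathrm{cm})\prime}_{ij}$ must also be constant. □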
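As a numerical illustration (a sketch only, not part of the original derivation), the body-system matrix of eqn (9.9) can be built for a few point masses and the matrix form of eqn (9.10) compared with the direct sum in eqn (9.6). The masses, positions, and angular velocity below are made-up values, and the helper names `cross` and `inertia_matrix` are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_matrix(masses, rhos):
    """Eqn (9.9): I_ij = sum_n m_n (|rho_n|^2 delta_ij - rho_ni rho_nj)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, rho in zip(masses, rhos):
        r2 = sum(c * c for c in rho)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                I[i][j] += m * (r2 * delta - rho[i] * rho[j])
    return I

# Made-up body: four point masses at arbitrary positions (assumed values).
masses = [1.0, 2.0, 3.0, 4.0]
positions = [(1.0, 0.0, 0.5), (0.0, 2.0, -1.0), (-1.0, 1.0, 0.0), (0.5, -0.5, 1.5)]

# Shift to center-of-mass-relative vectors rho_n = r_n - R.
M = sum(masses)
R = tuple(sum(m * r[k] for m, r in zip(masses, positions)) / M for k in range(3))
rhos = [tuple(r[k] - R[k] for k in range(3)) for r in positions]

omega = (0.3, -0.2, 0.9)  # body-system components of omega (assumed values)

# Direct sum, eqn (9.6): S = sum_n m_n rho_n x (omega x rho_n)
S_direct = [0.0, 0.0, 0.0]
for m, rho in zip(masses, rhos):
    term = cross(rho, cross(omega, rho))
    for k in range(3):
        S_direct[k] += m * term[k]

# Matrix form, eqn (9.10): S_i = sum_j I_ij omega_j
I_cm = inertia_matrix(masses, rhos)
S_matrix = [sum(I_cm[i][j] * omega[j] for j in range(3)) for i in range(3)]

print(S_direct)
print(S_matrix)
```

The two printed triples should agree to rounding error. Because the components $\rho'_{ni}$ are fixed in the body system, `I_cm` needs to be computed only once for a given body.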



9.3 The Inertia Dyadic 

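Like any operator equation, eqn (9.11) can also be written in dyadic form. The last
expression in eqn (9.6) can be written as

$$\begin{aligned} \mathbf{S} &= \sum_{n=1}^{N} m_n \left\{ (\boldsymbol{\rho}_n \cdot \boldsymbol{\rho}_n)\,\boldsymbol{\omega} - \boldsymbol{\rho}_n (\boldsymbol{\rho}_n \cdot \boldsymbol{\omega}) \right\} \\ &= \sum_{n=1}^{N} m_n \left\{ (\boldsymbol{\rho}_n \cdot \boldsymbol{\rho}_n)\,\mathbb{U} - \boldsymbol{\rho}_n \boldsymbol{\rho}_n \right\} \cdot \boldsymbol{\omega} = \mathbb{I}^{(\mathrm{cm})} \cdot \boldsymbol{\omega} \end{aligned} \tag{9.15}$$

where the center-of-mass inertia dyadic is defined by

$$\mathbb{I}^{(\mathrm{cm})} = \sum_{n=1}^{N} m_n \left\{ (\boldsymbol{\rho}_n \cdot \boldsymbol{\rho}_n)\,\mathbb{U} - \boldsymbol{\rho}_n \boldsymbol{\rho}_n \right\} \tag{9.16}$$

This same dyadic can also be derived from the component expression eqn (9.9) using
the definition of a dyadic in terms of its matrix elements from eqn (7.52), applied
here using basis vectors and matrix elements from the body system,

$$\mathbb{I}^{(\mathrm{cm})} = \sum_{i=1}^{3} \sum_{j=1}^{3} \hat{\mathbf{e}}'_i\, I^{(\mathrm{cm})\prime}_{ij}\, \hat{\mathbf{e}}'_j \tag{9.17}$$

An equivalent matrix equation can also be written. If we denote by $[S']$ and $[\omega']$
the column vectors of components of $\mathbf{S}$ and $\boldsymbol{\omega}$ in the body system, then

$$[S'] = \mathsf{I}^{(\mathrm{cm})\prime}\,[\omega'] \tag{9.18}$$

where the matrix elements of matrix $\mathsf{I}^{(\mathrm{cm})\prime}$ are those given in eqn (9.9).



9.4 Kinetic Energy of a Rigid Body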

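The total kinetic energy of any collection, including a rigid body, is given in Section
1.14 as

$$T = T_o + T_I \quad\text{where}\quad T_o = \frac{1}{2} M V^2 \quad\text{and}\quad T_I = \frac{1}{2} \sum_{n=1}^{N} m_n\, \dot{\boldsymbol{\rho}}_n \cdot \dot{\boldsymbol{\rho}}_n \tag{9.19}$$

Using eqn (9.5), this last expression may be rewritten for a rigid body as

$$T_I = \frac{1}{2} \sum_{n=1}^{N} m_n\, \dot{\boldsymbol{\rho}}_n \cdot (\boldsymbol{\omega} \times \boldsymbol{\rho}_n) = \frac{1}{2} \sum_{n=1}^{N} m_n\, (\boldsymbol{\rho}_n \times \dot{\boldsymbol{\rho}}_n) \cdot \boldsymbol{\omega} = \frac{1}{2}\, \mathbf{S} \cdot \boldsymbol{\omega} \tag{9.20}$$

where eqn (9.3) was used. Expanding $\mathbf{S}$ using eqn (9.11) then gives

$$T_I = \frac{1}{2}\, \boldsymbol{\omega} \cdot \left(\mathcal{I}^{(\mathrm{cm})} \boldsymbol{\omega}\right) = \frac{1}{2} \sum_{i=1}^{3} \sum_{j=1}^{3} I^{(\mathrm{cm})\prime}_{ij}\, \omega'_i\, \omega'_j \tag{9.21}$$

which expresses the internal kinetic energy in terms of the angular velocity and the
inertia operator.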
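The chain of equalities in eqns (9.19 - 9.20) can be checked numerically. The following sketch uses made-up masses and positions (chosen in opposite pairs so that the center of mass is at the origin) and compares the kinetic energy summed over particles with $\frac{1}{2}\mathbf{S}\cdot\boldsymbol{\omega}$. All values and helper names are assumptions for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

# Made-up rigid body; positions are already relative to the center of mass.
masses = [1.0, 1.0, 2.0, 2.0]
rhos = [(1.0, 2.0, 0.0), (-1.0, -2.0, 0.0), (0.0, 1.0, 3.0), (0.0, -1.0, -3.0)]
omega = (0.4, 0.1, -0.7)  # assumed angular velocity components

# Rigid-body relative velocities, eqn (9.5): rho_dot = omega x rho
rho_dots = [cross(omega, rho) for rho in rhos]

# Internal kinetic energy summed particle by particle, eqn (9.19)
T_direct = 0.5 * sum(m * dot(v, v) for m, v in zip(masses, rho_dots))

# Spin from eqn (9.3), then T_I = S . omega / 2 from eqn (9.20)
S = [0.0, 0.0, 0.0]
for m, rho, v in zip(masses, rhos, rho_dots):
    term = cross(rho, v)
    for k in range(3):
        S[k] += m * term[k]
T_spin = 0.5 * dot(S, omega)

print(T_direct, T_spin)
```

Both numbers should coincide to rounding error for any choice of `omega`.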

9.5 Meaning of the Inertia Operator 

The diagonal matrix elements of the matrix $\mathsf{I}^{(\mathrm{cm})\prime}$ are called moments of inertia. For
example, consider the element with $i = j = 3$,

$$I^{(\mathrm{cm})\prime}_{33} = \sum_{n=1}^{N} m_n \left\{ \left(\rho_{n1}^{\prime 2} + \rho_{n2}^{\prime 2} + \rho_{n3}^{\prime 2}\right) \delta_{33} - \rho'_{n3}\,\rho'_{n3} \right\} = \sum_{n=1}^{N} m_n \left(\rho_{n1}^{\prime 2} + \rho_{n2}^{\prime 2}\right) \tag{9.22}$$

which is the sum of each mass $m_n$ multiplied by the square of its perpendicular distance from a
line parallel to $\hat{\mathbf{e}}'_3$ and passing through the center of mass. 46 The other two diagonal
elements have similar expressions in terms of perpendicular distances from the other
coordinate axes.

The off-diagonal elements of $\mathsf{I}^{(\mathrm{cm})\prime}$ are called products of inertia. For example, the
element with $i = 1$ and $j = 3$ is

$$I^{(\mathrm{cm})\prime}_{13} = \sum_{n=1}^{N} m_n \left\{ \left(\rho_{n1}^{\prime 2} + \rho_{n2}^{\prime 2} + \rho_{n3}^{\prime 2}\right) \delta_{13} - \rho'_{n1}\,\rho'_{n3} \right\} = -\sum_{n=1}^{N} m_n\, \rho'_{n1}\,\rho'_{n3} \tag{9.23}$$

The other off-diagonal elements are similar.

46 If we imagine the body system of coordinates to have its origin at the center of mass of the body, this
is the distance of $m_n$ from the $\hat{\mathbf{e}}'_3$ axis. It is the moment of inertia about that axis that would be measured
by an observer standing at the center of mass.

9.6 Principal Axes

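From eqn (9.9), we notice that, by construction, the inertia matrix is real and symmetric,
with

$$I^{(\mathrm{cm})\prime}_{ji} = I^{(\mathrm{cm})\prime}_{ij} \tag{9.24}$$

As discussed in Section 7.12, any real symmetric operator in a three-dimensional
space has three real, mutually orthogonal, and normalized eigenvectors $\hat{\mathbf{V}}^{(k)}$ for $k = 1, 2, 3$
obeying the standard eigenvector equation

$$\mathcal{I}^{(\mathrm{cm})}\, \hat{\mathbf{V}}^{(k)} = \lambda_k\, \hat{\mathbf{V}}^{(k)} \tag{9.25}$$

These three orthonormal eigenvectors are called the principal axes of the rigid body.
The corresponding real eigenvalues are the three solutions $\lambda_1, \lambda_2, \lambda_3$ of the cubic
equation

$$\begin{vmatrix} I^{(\mathrm{cm})\prime}_{11} - \lambda & I^{(\mathrm{cm})\prime}_{12} & I^{(\mathrm{cm})\prime}_{13} \\ I^{(\mathrm{cm})\prime}_{21} & I^{(\mathrm{cm})\prime}_{22} - \lambda & I^{(\mathrm{cm})\prime}_{23} \\ I^{(\mathrm{cm})\prime}_{31} & I^{(\mathrm{cm})\prime}_{32} & I^{(\mathrm{cm})\prime}_{33} - \lambda \end{vmatrix} = 0 \tag{9.26}$$

Since the matrix elements $I^{(\mathrm{cm})\prime}_{ij}$ are all constant in time, the eigenvalues will also be
constants.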

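The eigenvectors may be expanded in the body system as

$$\hat{\mathbf{V}}^{(k)} = \sum_{i=1}^{3} V^{(k)\prime}_i\, \hat{\mathbf{e}}'_i \tag{9.27}$$

where the components $V^{(k)\prime}_i$ of the kth eigenvector are found by solving the equation

$$\begin{pmatrix} I^{(\mathrm{cm})\prime}_{11} - \lambda_k & I^{(\mathrm{cm})\prime}_{12} & I^{(\mathrm{cm})\prime}_{13} \\ I^{(\mathrm{cm})\prime}_{21} & I^{(\mathrm{cm})\prime}_{22} - \lambda_k & I^{(\mathrm{cm})\prime}_{23} \\ I^{(\mathrm{cm})\prime}_{31} & I^{(\mathrm{cm})\prime}_{32} & I^{(\mathrm{cm})\prime}_{33} - \lambda_k \end{pmatrix} \begin{pmatrix} V^{(k)\prime}_1 \\ V^{(k)\prime}_2 \\ V^{(k)\prime}_3 \end{pmatrix} = 0 \tag{9.28}$$

and applying the normalization condition to obtain unit eigenvectors with $\hat{\mathbf{V}}^{(k)} \cdot \hat{\mathbf{V}}^{(k)} = 1$.
Since the eigenvalues and matrix elements are all constant in time, the components
$V^{(k)\prime}_i$ will also be constant. The three eigenvectors now form an orthonormal set, with

$$\mathcal{I}^{(\mathrm{cm})}\, \hat{\mathbf{V}}^{(k)} = \lambda_k\, \hat{\mathbf{V}}^{(k)} \quad\text{and}\quad \hat{\mathbf{V}}^{(k)} \cdot \hat{\mathbf{V}}^{(l)} = \delta_{kl} \tag{9.29}$$

for all $k, l = 1, 2, 3$.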

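Now suppose that we choose a new set of basis vectors $\hat{\mathbf{e}}''_i$ equal to the eigenvectors
just found,

$$\hat{\mathbf{e}}''_i = \hat{\mathbf{V}}^{(i)} \tag{9.30}$$

for $i = 1, 2, 3$, where possibly the indices of the eigenvectors may need to be interchanged,
or one eigenvector replaced by its negative, to make sure that the $\hat{\mathbf{e}}''_i$ form
a right-handed set of basis vectors. Expressed in this new system, the inertia operator
will have a matrix $\mathsf{I}^{(\mathrm{cm})\prime\prime}$ defined by its matrix elements

$$I^{(\mathrm{cm})\prime\prime}_{ij} = \hat{\mathbf{e}}''_i \cdot \left(\mathcal{I}^{(\mathrm{cm})}\, \hat{\mathbf{e}}''_j\right) \tag{9.31}$$

But by eqns (9.25, 9.30), $\mathcal{I}^{(\mathrm{cm})}\, \hat{\mathbf{e}}''_j = \lambda_j\, \hat{\mathbf{e}}''_j$ and hence, using the orthonormality of the
new basis vectors,

$$I^{(\mathrm{cm})\prime\prime}_{ij} = \hat{\mathbf{e}}''_i \cdot \left(\lambda_j\, \hat{\mathbf{e}}''_j\right) = \lambda_j\, \delta_{ij} \tag{9.32}$$

The eigenvalues will be denoted, for $j = 1, 2, 3$, by

$$\lambda_j = I^{(\mathrm{cm})\prime\prime}_j \tag{9.33}$$

and will be called the principal moments of inertia of the rigid body. Thus, in the $\hat{\mathbf{e}}''_i$
system, the matrix corresponding to the inertia operator will be

$$\mathsf{I}^{(\mathrm{cm})\prime\prime} = \begin{pmatrix} I^{(\mathrm{cm})\prime\prime}_1 & 0 & 0 \\ 0 & I^{(\mathrm{cm})\prime\prime}_2 & 0 \\ 0 & 0 & I^{(\mathrm{cm})\prime\prime}_3 \end{pmatrix} \quad\text{with}\quad I^{(\mathrm{cm})\prime\prime}_{ij} = I^{(\mathrm{cm})\prime\prime}_j\, \delta_{ij} \tag{9.34}$$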
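The procedure of eqns (9.25 - 9.34) can be sketched numerically. Instead of solving the cubic equation (9.26) directly, the example below swaps in cyclic Jacobi rotations, a standard method for real symmetric matrices, to obtain the eigenvalues and eigenvectors. It then flips one eigenvector if needed to get a right-handed triad and confirms that the matrix becomes diagonal in the new basis. The inertia matrix entries and the function names are made-up for illustration.

```python
import math

def jacobi_eigen(A, max_sweeps=50, tol=1e-24):
    """Diagonalize a real symmetric 3x3 matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, V), where column k of V is the k-th unit eigenvector.
    """
    A = [row[:] for row in A]
    V = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(max_sweeps):
        off = sum(A[i][j] ** 2 for i in range(3) for j in range(3) if i != j)
        if off < tol:
            break
        for p in range(2):
            for q in range(p + 1, 3):
                if A[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes A[p][q]; keep |theta| <= pi/4.
                theta = 0.5 * math.atan2(2.0 * A[p][q], A[q][q] - A[p][p])
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                for k in range(3):  # A <- A J
                    akp, akq = A[k][p], A[k][q]
                    A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(3):  # A <- J^T A
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(3):  # V <- V J accumulates the eigenvectors
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p], V[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [A[i][i] for i in range(3)], V

def det3(M):
    """Determinant of a 3x3 matrix."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Made-up symmetric inertia matrix in some non-principal body system.
I_body = [[48.0, -4.0, 0.0],
          [-4.0, 38.0, -12.0],
          [0.0, -12.0, 14.0]]

lams, V = jacobi_eigen(I_body)

# Replace one eigenvector by its negative if the triad is left-handed.
if det3(V) < 0:
    for i in range(3):
        V[i][2] = -V[i][2]

# Matrix elements in the new basis, eqn (9.31); should equal lambda_j delta_ij.
I_principal = [[sum(V[a][i] * I_body[a][b] * V[b][j]
                    for a in range(3) for b in range(3))
                for j in range(3)] for i in range(3)]

print(lams)
print(I_principal)
```

The printed `I_principal` should be diagonal to rounding error, with the principal moments on the diagonal in the same order as `lams`.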



Warning: Change of Notation 

In subsequent work, unless explicitly stated otherwise, we will assume that any body
system of coordinates used is already a principal axis system of the center-of-mass
inertia operator. We assume that the task of finding principal axes, if necessary, has
already been done. However, for notational simplicity, the double prime denoting the
principal axis system above will be replaced by a single prime. The effect is that,
dropping the double prime now, we will assume any body system to have a diagonal
center-of-mass inertia matrix with

$$\mathsf{I}^{(\mathrm{cm})\prime} = \begin{pmatrix} I^{(\mathrm{cm})\prime}_1 & 0 & 0 \\ 0 & I^{(\mathrm{cm})\prime}_2 & 0 \\ 0 & 0 & I^{(\mathrm{cm})\prime}_3 \end{pmatrix} \quad\text{with}\quad I^{(\mathrm{cm})\prime}_{ij} = I^{(\mathrm{cm})\prime}_j\, \delta_{ij} \tag{9.35}$$

Thus all products of inertia such as in eqn (9.23) will vanish, and the principal moments
of inertia will be given by eqn (9.9) with $i = j$,

$$I^{(\mathrm{cm})\prime}_i = \sum_{n=1}^{N} m_n \left\{ \left(\rho_{n1}^{\prime 2} + \rho_{n2}^{\prime 2} + \rho_{n3}^{\prime 2}\right) - \rho_{ni}^{\prime 2} \right\} \tag{9.36}$$



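Use of the principal axis system leads to a considerable simplification. For example,
eqn (9.10) becomes

$$S'_i = \sum_{j=1}^{3} I^{(\mathrm{cm})\prime}_{ij}\, \omega'_j = \sum_{j=1}^{3} I^{(\mathrm{cm})\prime}_j\, \delta_{ij}\, \omega'_j = I^{(\mathrm{cm})\prime}_i\, \omega'_i \tag{9.37}$$

for each individual value $i = 1, 2, 3$, which says that each component of the spin is
just the corresponding component of the angular velocity multiplied by the principal
moment of inertia, 47

$$S'_1 = I^{(\mathrm{cm})\prime}_1\, \omega'_1 \qquad S'_2 = I^{(\mathrm{cm})\prime}_2\, \omega'_2 \qquad S'_3 = I^{(\mathrm{cm})\prime}_3\, \omega'_3 \tag{9.38}$$

The expression for the internal kinetic energy in eqn (9.21) also simplifies, to a
single sum over the squares of the angular velocity components multiplied by the
principal moments of inertia,

$$T_I = \frac{1}{2}\, \boldsymbol{\omega} \cdot \left(\mathcal{I}^{(\mathrm{cm})} \boldsymbol{\omega}\right) = \frac{1}{2} \sum_{i=1}^{3} \sum_{j=1}^{3} I^{(\mathrm{cm})\prime}_{ij}\, \omega'_i\, \omega'_j = \frac{1}{2} \sum_{i=1}^{3} \sum_{j=1}^{3} I^{(\mathrm{cm})\prime}_j\, \delta_{ij}\, \omega'_i\, \omega'_j = \frac{1}{2} \sum_{i=1}^{3} I^{(\mathrm{cm})\prime}_i\, \omega_i^{\prime 2} \tag{9.39}$$

When expressed in terms of principal axis unit vectors, the center-of-mass inertia
dyadic defined in eqn (9.17) also has a simple form. It becomes

$$\mathbb{I}^{(\mathrm{cm})} = \sum_{i=1}^{3} \sum_{j=1}^{3} \hat{\mathbf{e}}'_i\, I^{(\mathrm{cm})\prime}_j\, \delta_{ij}\, \hat{\mathbf{e}}'_j = \sum_{i=1}^{3} \hat{\mathbf{e}}'_i\, I^{(\mathrm{cm})\prime}_i\, \hat{\mathbf{e}}'_i \tag{9.40}$$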



9.7 Guessing the Principal Axes 

We know that any rigid body will have a system of principal axes. If necessary, we 
can choose three arbitrary body-fixed axes, calculate the inertia matrix, and then go 
through the procedure to find the principal axis eigenvectors. But in many situations 
of interest, the directions of the principal axes can be guessed (with certainty) from 
the symmetry of the rigid body. We give here several rules that can be used. 

47 Readers accustomed to seeing the Einstein summation convention should note that no sum over i is
intended or implied in eqn (9.37).

Lemma 9.7.1: The Plane-Figure Theorem

If the rigid body is flat and of negligible thickness (a plane figure), then the unit vector
perpendicular to the plane will be a principal axis. Moreover, when the other two
principal axes are found, the principal moments of inertia will obey the relation

$$I^{(\mathrm{cm})\prime}_3 = I^{(\mathrm{cm})\prime}_1 + I^{(\mathrm{cm})\prime}_2 \tag{9.41}$$

where we assume for definiteness that the perpendicular to the plane was chosen to be
$\hat{\mathbf{e}}'_3$.

Proof: The proof begins by noting that all products of inertia involving the perpendicular
direction will vanish. Assuming the perpendicular to be $\hat{\mathbf{e}}'_3$, eqn (9.9) gives, for
$i = 1, 2$,

$$I^{(\mathrm{cm})\prime}_{i3} = -\sum_{n=1}^{N} m_n\, \rho'_{ni}\, \rho'_{n3} \tag{9.42}$$

But $\rho'_{n3} = 0$ was assumed for all n values, hence $I^{(\mathrm{cm})\prime}_{13} = I^{(\mathrm{cm})\prime}_{23} = 0$ and the inertia
matrix has the form

$$\mathsf{I}^{(\mathrm{cm})\prime} = \begin{pmatrix} I^{(\mathrm{cm})\prime}_{11} & I^{(\mathrm{cm})\prime}_{12} & 0 \\ I^{(\mathrm{cm})\prime}_{21} & I^{(\mathrm{cm})\prime}_{22} & 0 \\ 0 & 0 & I^{(\mathrm{cm})\prime}_{33} \end{pmatrix} \tag{9.43}$$

Thus the vector $\hat{\mathbf{e}}'_3$, which has components (0, 0, 1), will be an eigenvector and hence
a principal axis, as was to be proved.

The equality in eqn (9.41) can now be proved. With $\rho'_{n3} = 0$, the three principal
moments of inertia in eqn (9.36) become

$$I^{(\mathrm{cm})\prime}_1 = \sum_{n=1}^{N} m_n\, \rho_{n2}^{\prime 2} \qquad I^{(\mathrm{cm})\prime}_2 = \sum_{n=1}^{N} m_n\, \rho_{n1}^{\prime 2} \qquad I^{(\mathrm{cm})\prime}_3 = \sum_{n=1}^{N} m_n \left(\rho_{n1}^{\prime 2} + \rho_{n2}^{\prime 2}\right) \tag{9.44}$$

from which eqn (9.41) follows. □
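A quick numerical check of the plane-figure theorem is sketched below for a made-up set of point masses lying in the plane $\rho'_{n3} = 0$. With $\rho'_{n3} = 0$, the diagonal elements of eqn (9.9) satisfy $I_{33} = I_{11} + I_{22}$ for any choice of in-plane axes, so the check does not require finding the in-plane principal axes first. All numbers and the helper name `inertia_matrix` are assumptions.

```python
def inertia_matrix(masses, rhos):
    """Eqn (9.9): I_ij = sum_n m_n (|rho_n|^2 delta_ij - rho_ni rho_nj)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, rho in zip(masses, rhos):
        r2 = sum(c * c for c in rho)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                I[i][j] += m * (r2 * delta - rho[i] * rho[j])
    return I

# Made-up plane figure: every mass has zero third coordinate.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (-1.0, -1.0, 0.0)]

# Positions relative to the center of mass (which also lies in the plane).
M = sum(masses)
R = tuple(sum(m * r[k] for m, r in zip(masses, positions)) / M for k in range(3))
rhos = [tuple(r[k] - R[k] for k in range(3)) for r in positions]

I = inertia_matrix(masses, rhos)

print("I13, I23 =", I[0][2], I[1][2])        # both vanish, eqn (9.42)
print("I33      =", I[2][2])
print("I11 + I22 =", I[0][0] + I[1][1])      # equals I33
```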

Lemma 9.7.2: The Symmetry Rule 

Suppose there is a symmetry plane passing through the center of mass of a rigid body such
that, for each mass $m_n$ on one side of the plane, there is a mirror-image mass $m_p = m_n$
on the other side. Then the perpendicular to the symmetry plane will be a principal axis.

Proof: The proof is similar to that of the Plane-Figure Theorem. Assume $\hat{\mathbf{e}}'_3$ chosen to
be the perpendicular to the symmetry plane, so that the symmetry plane is the $\hat{\mathbf{e}}'_1$-$\hat{\mathbf{e}}'_2$
plane. Then the sum in eqn (9.42) will vanish because each term of the form $m_n ab$
will be matched by a term $m_p a(-b)$ that cancels it. Thus, the inertia matrix will have
the form shown in eqn (9.43) and so $\hat{\mathbf{e}}'_3$ will once again be an eigenvector and hence
a principal axis, as was to be proved. □



Lemma 9.7.3: Figures of Rotation 

For any figure of rotation (such as might be turned on a lathe), the symmetry axis and
any two unit vectors perpendicular to it, and to each other, will be the principal axes.
Also, if we assume for definiteness (and according to the usual custom) that $\hat{\mathbf{e}}'_3$ is along
the symmetry axis, then

$$I^{(\mathrm{cm})\prime}_1 = I^{(\mathrm{cm})\prime}_2 \tag{9.45}$$

Proof: Note that any plane containing the symmetry axis of the figure will be a symmetry
plane of the sort described in the Symmetry Rule above. Thus any unit vector
perpendicular to the symmetry axis will be a principal axis. Choose two perpendicular
vectors from this set. With these as two of the principal axes, the only remaining
direction is the symmetry axis itself. Hence the symmetry axis must also be a principal
axis, as was to be proved. Rotating the rigid body by 90° about the symmetry
axis will move $\hat{\mathbf{e}}'_1$ into $\hat{\mathbf{e}}'_2$ but will not change the mass distribution. Hence eqn (9.45)
follows. □



Lemma 9.7.4: Cuboids 

For any cuboid (a body with six rectangular faces), the perpendiculars to the faces will
be principal axes.

Proof: This rule follows from application of the symmetry rule with planes of symmetry
parallel to the faces and cutting the cuboid into two equal parts. □

For a continuous mass distribution with a mass density D, eqn (9.36) may be
generalized to

$$I^{(\mathrm{cm})\prime}_i = \int dm \left\{ \left(\rho_1^{\prime 2} + \rho_2^{\prime 2} + \rho_3^{\prime 2}\right) - \rho_i^{\prime 2} \right\} \tag{9.46}$$

where $\boldsymbol{\rho} = \mathbf{r} - \mathbf{R}$ is the location of mass element $dm = D\, d^3\rho$ relative to the center of
mass, and the integration is over the whole of the rigid body.
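As an illustration of eqn (9.46), the sketch below estimates the principal moments of a uniform cuboid by a midpoint-rule sum over small cells, with body axes taken perpendicular to the faces as Lemma 9.7.4 allows. The result is compared with the standard closed forms such as $I_1 = M(b^2 + c^2)/12$. The side lengths, mass, grid size, and function name are made-up values for illustration.

```python
def cuboid_moments(a, b, c, mass, n=40):
    """Midpoint-rule estimate of eqn (9.46) for a uniform cuboid.

    The cuboid has sides a, b, c along the body axes 1, 2, 3 and is
    centered on its center of mass. Each of the n**3 cells carries mass dm.
    """
    dm = mass / n ** 3
    xs = [(-0.5 + (k + 0.5) / n) * a for k in range(n)]
    ys = [(-0.5 + (k + 0.5) / n) * b for k in range(n)]
    zs = [(-0.5 + (k + 0.5) / n) * c for k in range(n)]
    I1 = I2 = I3 = 0.0
    for x in xs:
        for y in ys:
            for z in zs:
                I1 += dm * (y * y + z * z)
                I2 += dm * (x * x + z * z)
                I3 += dm * (x * x + y * y)
    return I1, I2, I3

a, b, c, mass = 1.0, 2.0, 3.0, 6.0   # assumed dimensions and mass
numeric = cuboid_moments(a, b, c, mass)
exact = (mass * (b * b + c * c) / 12.0,
         mass * (a * a + c * c) / 12.0,
         mass * (a * a + b * b) / 12.0)

print(numeric)
print(exact)
```

The midpoint sum slightly underestimates each moment, and the discrepancy shrinks as the number of cells per side `n` grows.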



9.8 Time Evolution of the Spin 

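Assume now that the body axes $\hat{\mathbf{e}}'_i$ are the principal axes of the center-of-mass inertia
operator $\mathcal{I}^{(\mathrm{cm})}$. These $\hat{\mathbf{e}}'_i$ will be called the principal axis system. We continue the
treatment of the tumbling asteroid introduced in Section 9.2 by considering the rate
of change of its spin.

From eqn (9.38), in the principal axis system the spin takes the form

$$\mathbf{S} = \sum_{i=1}^{3} I^{(\mathrm{cm})\prime}_i\, \omega'_i\, \hat{\mathbf{e}}'_i \tag{9.47}$$

Its rate of change may be calculated using the body derivative introduced in Section
8.33,

$$\frac{d\mathbf{S}}{dt} = \left(\frac{d\mathbf{S}}{dt}\right)_b + \boldsymbol{\omega} \times \mathbf{S} \tag{9.48}$$

where

$$\left(\frac{d\mathbf{S}}{dt}\right)_b = \sum_{i=1}^{3} \frac{dS'_i}{dt}\, \hat{\mathbf{e}}'_i = \sum_{i=1}^{3} I^{(\mathrm{cm})\prime}_i\, \dot{\omega}'_i\, \hat{\mathbf{e}}'_i \tag{9.49}$$

in which $\dot{\omega}'_i = d\omega'_i/dt$ and the constancy of $I^{(\mathrm{cm})\prime}_i$ from Lemma 9.2.1 was used.
The equation of motion for $\mathbf{S}$ from eqn (9.4) then becomes

$$\boldsymbol{\tau}_s^{(\mathrm{ext})} = \frac{d\mathbf{S}}{dt} = \sum_{i=1}^{3} I^{(\mathrm{cm})\prime}_i\, \dot{\omega}'_i\, \hat{\mathbf{e}}'_i + \boldsymbol{\omega} \times \mathbf{S} \tag{9.50}$$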

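Expanding this equation in terms of components in the body system gives, for
$i = 1, 2, 3$,

$$\tau'_{si} = I^{(\mathrm{cm})\prime}_i\, \dot{\omega}'_i + \sum_{j=1}^{3} \sum_{k=1}^{3} \varepsilon_{ijk}\, \omega'_j\, S'_k = I^{(\mathrm{cm})\prime}_i\, \dot{\omega}'_i + \sum_{j=1}^{3} \sum_{k=1}^{3} \varepsilon_{ijk}\, \omega'_j\, I^{(\mathrm{cm})\prime}_k\, \omega'_k \tag{9.51}$$

where eqn (A.15) was used to expand the cross product and eqn (9.38) was used to
get the second equality.

Equation (9.51) is often written in a slightly modified form,

$$I^{(\mathrm{cm})\prime}_i\, \dot{\omega}'_i = \sum_{j=1}^{3} \sum_{k=1}^{3} \varepsilon_{ikj}\, I^{(\mathrm{cm})\prime}_k\, \omega'_k\, \omega'_j + \tau'_{si} \tag{9.52}$$

When each $i = 1, 2, 3$ component is written out, these equations have a symmetry
which makes them easy to remember,

$$I^{(\mathrm{cm})\prime}_1\, \dot{\omega}'_1 = \omega'_2\, \omega'_3 \left(I^{(\mathrm{cm})\prime}_2 - I^{(\mathrm{cm})\prime}_3\right) + \tau'_{s1} \tag{9.53}$$

$$I^{(\mathrm{cm})\prime}_2\, \dot{\omega}'_2 = \omega'_3\, \omega'_1 \left(I^{(\mathrm{cm})\prime}_3 - I^{(\mathrm{cm})\prime}_1\right) + \tau'_{s2} \tag{9.54}$$

$$I^{(\mathrm{cm})\prime}_3\, \dot{\omega}'_3 = \omega'_1\, \omega'_2 \left(I^{(\mathrm{cm})\prime}_1 - I^{(\mathrm{cm})\prime}_2\right) + \tau'_{s3} \tag{9.55}$$

Each successive formula is obtained by a cyclic permutation of the integers 123 relative
to the previous one.

Equations (9.53 - 9.55), like many others in this subject, are called the Euler
equations. They give a set of coupled differential equations for the components $\omega'_i$ of
the angular velocity vector relative to the body system of coordinates.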
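Since the Euler equations are coupled and nonlinear, a numerical integration is often the practical way to follow $\omega'_i(t)$. The sketch below integrates eqns (9.53 - 9.55) with zero torque using a classical fourth-order Runge-Kutta step. It then monitors the kinetic energy of eqn (9.39) and the magnitude of the spin from eqn (9.38), which are expected to stay constant for a torque-free body. The principal moments, initial conditions, step size, and function names are all assumed for illustration.

```python
def euler_rhs(omega, I, torque=(0.0, 0.0, 0.0)):
    """Eqns (9.53 - 9.55) solved for the time derivatives of omega'_i."""
    w1, w2, w3 = omega
    I1, I2, I3 = I
    t1, t2, t3 = torque
    return ((w2 * w3 * (I2 - I3) + t1) / I1,
            (w3 * w1 * (I3 - I1) + t2) / I2,
            (w1 * w2 * (I1 - I2) + t3) / I3)

def rk4_step(omega, I, dt):
    """One classical Runge-Kutta step for the torque-free Euler equations."""
    def shifted(k, h):
        return tuple(w + h * dk for w, dk in zip(omega, k))
    k1 = euler_rhs(omega, I)
    k2 = euler_rhs(shifted(k1, dt / 2), I)
    k3 = euler_rhs(shifted(k2, dt / 2), I)
    k4 = euler_rhs(shifted(k3, dt), I)
    return tuple(w + dt / 6 * (a + 2 * b + 2 * c + d)
                 for w, a, b, c, d in zip(omega, k1, k2, k3, k4))

def kinetic_energy(omega, I):
    """Internal kinetic energy in the principal axis system, eqn (9.39)."""
    return 0.5 * sum(Ii * w * w for Ii, w in zip(I, omega))

def spin_magnitude(omega, I):
    """Magnitude of S built from the components in eqn (9.38)."""
    return sum((Ii * w) ** 2 for Ii, w in zip(I, omega)) ** 0.5

I = (1.0, 2.0, 3.0)        # assumed principal moments of an asymmetric body
omega = (1.0, 0.1, 0.5)    # assumed initial body-system angular velocity
T_start, S_start = kinetic_energy(omega, I), spin_magnitude(omega, I)

dt = 1e-3
for _ in range(5000):
    omega = rk4_step(omega, I, dt)

T_end, S_end = kinetic_energy(omega, I), spin_magnitude(omega, I)
print(T_start, T_end)
print(S_start, S_end)
```

A nonzero torque can be passed to `euler_rhs` through its `torque` argument, given as body-system components $\tau'_{si}$.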



9.9 Torque-Free Motion of a Symmetric Body 

Imagine now that the tumbling asteroid is replaced by a spaceship or other object
(such as the Earth, or a football) with rotational symmetry about some axis. Taking
the symmetry axis to be $\hat{\mathbf{e}}'_3$ as is conventional, it follows from Lemma 9.7.3 that such
objects have two equal principal moments of inertia $I^{(\mathrm{cm})\prime}_1 = I^{(\mathrm{cm})\prime}_2 \neq I^{(\mathrm{cm})\prime}_3$. Bodies
with $I^{(\mathrm{cm})\prime}_1 = I^{(\mathrm{cm})\prime}_2$ will be referred to as symmetric rigid bodies.

Assume further that the symmetric body is moving with $\boldsymbol{\tau}_s^{(\mathrm{ext})} = 0$. The Euler
equations of Section 9.8 can then be solved exactly for the angular velocity and spin
as functions of time. Although the torque-free symmetric body is a simple case, the
motion is surprisingly complicated.

We begin by assuming $I^{(\mathrm{cm})\prime}_1 = I^{(\mathrm{cm})\prime}_2$ and writing the Euler equations eqns (9.53
- 9.55) as

$$I^{(\mathrm{cm})\prime}_1\, \dot{\omega}'_1 = \omega'_2\, \omega'_3 \left(I^{(\mathrm{cm})\prime}_1 - I^{(\mathrm{cm})\prime}_3\right) \tag{9.56}$$

$$I^{(\mathrm{cm})\prime}_1\, \dot{\omega}'_2 = \omega'_3\, \omega'_1 \left(I^{(\mathrm{cm})\prime}_3 - I^{(\mathrm{cm})\prime}_1\right) \tag{9.57}$$

$$I^{(\mathrm{cm})\prime}_3\, \dot{\omega}'_3 = 0 \tag{9.58}$$

It follows at once from the third equation that $\omega'_3$ is a constant equal to its value at
time zero, $\omega'_3 = \omega'_{30}$. Then the other two equations can be rewritten as

$$\dot{\omega}'_1 = -\Omega_0\, \omega'_2 \quad\text{and}\quad \dot{\omega}'_2 = \Omega_0\, \omega'_1 \tag{9.59}$$

where the constant $\Omega_0$ is defined by

$$\Omega_0 = \frac{\omega'_{30} \left(I^{(\mathrm{cm})\prime}_3 - I^{(\mathrm{cm})\prime}_1\right)}{I^{(\mathrm{cm})\prime}_1} \tag{9.60}$$

Throughout this and the following sections, we will assume for definiteness that
the principal axis directions have been chosen so that $\omega'_{30} > 0$. Then $\Omega_0 > 0$ when
$I^{(\mathrm{cm})\prime}_3 > I^{(\mathrm{cm})\prime}_1$, as happens for oblate bodies like a thin circular disk, a thin square, or
the Earth. But $\Omega_0 < 0$ when $I^{(\mathrm{cm})\prime}_3 < I^{(\mathrm{cm})\prime}_1$, as happens for prolate bodies like a long
rod, a long stick of square cross section, or an American or Rugby football.

The first of eqn (9.59) can be differentiated and the second substituted into it to
give

$$\ddot{\omega}'_1 = -\Omega_0^2\, \omega'_1 \tag{9.61}$$

which has the general solution

$$\omega'_1 = A \cos(\Omega_0 t + \delta) \tag{9.62}$$

where $A \geq 0$ and $-\pi \leq \delta \leq \pi$ are constants of integration to be determined at time
zero. The other component is then

$$\omega'_2 = -\frac{\dot{\omega}'_1}{\Omega_0} = A \sin(\Omega_0 t + \delta) \tag{9.63}$$

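The angular velocity vector is thus completely determined from its initial values. It is

$$\boldsymbol{\omega} = A \left\{ \cos(\Omega_0 t + \delta)\, \hat{\mathbf{e}}'_1 + \sin(\Omega_0 t + \delta)\, \hat{\mathbf{e}}'_2 \right\} + \omega'_{30}\, \hat{\mathbf{e}}'_3 = A\, \hat{\mathbf{n}}(t) + \omega'_{30}\, \hat{\mathbf{e}}'_3 \tag{9.64}$$

where the unit vector $\hat{\mathbf{n}}(t)$ is defined by

$$\hat{\mathbf{n}}(t) = \cos(\Omega_0 t + \delta)\, \hat{\mathbf{e}}'_1 + \sin(\Omega_0 t + \delta)\, \hat{\mathbf{e}}'_2 \tag{9.65}$$

As seen by an observer in the body system, the vector $\hat{\mathbf{n}}$ will rotate in a right-handed
sense about the symmetry axis $\hat{\mathbf{e}}'_3$ when $I^{(\mathrm{cm})\prime}_3 > I^{(\mathrm{cm})\prime}_1$ and in the opposite sense when
$I^{(\mathrm{cm})\prime}_3 < I^{(\mathrm{cm})\prime}_1$.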
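The spin angular momentum can also be written. Using eqn (9.38) and the assumed
equality $I^{(\mathrm{cm})\prime}_1 = I^{(\mathrm{cm})\prime}_2$, it is

$$\mathbf{S} = I^{(\mathrm{cm})\prime}_1 \left(\omega'_1\, \hat{\mathbf{e}}'_1 + \omega'_2\, \hat{\mathbf{e}}'_2\right) + I^{(\mathrm{cm})\prime}_3\, \omega'_{30}\, \hat{\mathbf{e}}'_3 = I^{(\mathrm{cm})\prime}_1 A\, \hat{\mathbf{n}}(t) + I^{(\mathrm{cm})\prime}_3\, \omega'_{30}\, \hat{\mathbf{e}}'_3 \tag{9.66}$$

It is seen that $\boldsymbol{\omega}$ and $\mathbf{S}$ appear to rotate together about the symmetry axis $\hat{\mathbf{e}}'_3$. Their
components perpendicular to the symmetry axis are both parallel to the same unit
vector $\hat{\mathbf{n}}(t)$. The sign of A in eqn (9.62) has been chosen so that $\delta = 0$ will place both
$\boldsymbol{\omega}$ and $\mathbf{S}$ in the $\hat{\mathbf{e}}'_1$-$\hat{\mathbf{e}}'_3$ plane at time zero.

The constant magnitude of $\mathbf{S}$ is found from eqn (9.66) to be

$$S_0 = \|\mathbf{S}\| = \sqrt{\left(I^{(\mathrm{cm})\prime}_1 A\right)^2 + \left(I^{(\mathrm{cm})\prime}_3\, \omega'_{30}\right)^2} \tag{9.67}$$

Since we know that a torque free rigid body has $d\mathbf{S}/dt = 0$, we know that $\mathbf{S}$ must
be a constant vector relative to inertial space, both in magnitude and direction. We
exploit this constancy of $\mathbf{S}$ by choosing the space-fixed, inertial coordinate system $\hat{\mathbf{e}}_k$
such that $\mathbf{S} = S_0\, \hat{\mathbf{e}}_3$.

Since $\mathbf{S}$ is an absolute constant, the time variation of the components of $\mathbf{S}$ in eqn
(9.66) must be due to the motion of the unit vectors $\hat{\mathbf{e}}'_i$, and hence of the rigid body in
which they are embedded and whose orientation they define. Note that the solution
with $A = 0$ is trivial, with $\mathbf{S}$, $\boldsymbol{\omega}$, and $\hat{\mathbf{e}}'_3$ all aligned and all constant in time. We will
assume the interesting case $A > 0$ from now on.

The angle $\theta_{\omega 3'}$ between the vector $\boldsymbol{\omega}$ and the symmetry axis $\hat{\mathbf{e}}'_3$ can be determined
from

$$\cos\theta_{\omega 3'} = \frac{\omega'_{30}}{\sqrt{A^2 + \omega_{30}^{\prime 2}}} \tag{9.68}$$

and is a constant. The angle $\theta_{33'}$ between the vector $\mathbf{S}$ and the symmetry axis can be
similarly determined from

$$\cos\theta_{33'} = \frac{I^{(\mathrm{cm})\prime}_3\, \omega'_{30}}{\sqrt{\left(I^{(\mathrm{cm})\prime}_1 A\right)^2 + \left(I^{(\mathrm{cm})\prime}_3\, \omega'_{30}\right)^2}} \tag{9.69}$$

and is also a constant.

The assumptions that $\omega'_{30} > 0$ and $A > 0$ imply that $0 < \theta_{\omega 3'} < \pi/2$ and $0 < \theta_{33'} < \pi/2$.
It follows from eqns (9.68, 9.69) that $I^{(\mathrm{cm})\prime}_3 > I^{(\mathrm{cm})\prime}_1$, as for the Earth, implies
that $\theta_{33'} < \theta_{\omega 3'}$. And $I^{(\mathrm{cm})\prime}_3 < I^{(\mathrm{cm})\prime}_1$, as for a football, implies the opposite inequality
$\theta_{33'} > \theta_{\omega 3'}$.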
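The two constant angles can be evaluated for sample bodies to see the inequalities at work. The sketch below uses the equivalent tangent forms of eqns (9.68, 9.69), namely $\tan\theta_{\omega 3'} = A/\omega'_{30}$ and $\tan\theta_{33'} = I_1 A/(I_3\,\omega'_{30})$. The function name and the oblate and prolate parameter values are made-up for illustration.

```python
import math

def cone_angles(I1, I3, w30, A):
    """Return (theta_omega3, theta_33): the angles that omega and S make with e3'.

    Equivalent to eqns (9.68, 9.69), written with atan2 since w30 > 0 and A > 0.
    """
    theta_omega3 = math.atan2(A, w30)
    theta_33 = math.atan2(I1 * A, I3 * w30)
    return theta_omega3, theta_33

# Assumed sample bodies with the same omega'_30 and A.
oblate = cone_angles(I1=1.0, I3=1.5, w30=2.0, A=0.3)    # I3 > I1, Earth-like
prolate = cone_angles(I1=1.0, I3=0.2, w30=2.0, A=0.3)   # I3 < I1, football-like

print("oblate :", [math.degrees(x) for x in oblate])
print("prolate:", [math.degrees(x) for x in prolate])
```

For the oblate case the spin lies closer to the symmetry axis than the angular velocity does, and for the prolate case the order is reversed.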

The motion of the torque-free symmetric body can be understood by a geometric
construction. A space-fixed right circular cone, called the space cone, is drawn with
its symmetry axis along $\mathbf{S}$ and its surface defined by the path of $\boldsymbol{\omega}$. A body-fixed right
circular cone, called the body cone, is drawn with its symmetry axis along $\hat{\mathbf{e}}'_3$ and its
surface defined by the path of $\boldsymbol{\omega}$ relative to the body system. The half angle of the
body cone is thus $\theta_{\omega 3'}$. These two cones are placed so that $\boldsymbol{\omega}$ is always along their line
of intersection, which makes the body cone roll on the space cone without slipping.
The body cone carries the body system of coordinates with it as it rolls, and thus
illustrates the motion of the body. The two cases, for $I^{(\mathrm{cm})\prime}_3 > I^{(\mathrm{cm})\prime}_1$ and $I^{(\mathrm{cm})\prime}_3 < I^{(\mathrm{cm})\prime}_1$,
are shown in Figures 9.1 and 9.2. In cases like the Earth, the body cone encloses the
space cone. In cases like the football, the body cone rolls on the outside of the space
cone.

A great deal of qualitative information can be extracted from the results of this
section. We now estimate some magnitudes of interest. For example, the Earth has a
small positive value of the ratio $(I^{(\mathrm{cm})\prime}_3 - I^{(\mathrm{cm})\prime}_1)/I^{(\mathrm{cm})\prime}_1$. To a body-system observer, the
$\hat{\mathbf{e}}'_3$ axis appears fixed and the vectors $\mathbf{S}$ and $\boldsymbol{\omega}$ (listed here in order of their angle from
$\hat{\mathbf{e}}'_3$) appear to rotate about it in a positive sense with an angular rate $\Omega_0 > 0$ that is
slow compared to the total angular velocity $\|\boldsymbol{\omega}\| \approx 2\pi/(1\ \mathrm{day})$. The angle $\theta_{33'}$ is only
slightly smaller than $\theta_{\omega 3'}$ and hence the space cone is small compared to the body
cone, as is shown in Figure 9.1.

Fig. 9.1. For an oblate object like the Earth, $\Omega_0 > 0$. In the figure at the left, the body cone
rolls without slipping on the space cone, carrying the $\hat{\mathbf{e}}'_i$ axes with it. The angular velocity
$\boldsymbol{\omega}$ is along the line of contact of the two cones, and the Euler angle $\alpha$ increases steadily.
The figure on the right shows the motion from the viewpoint of an observer standing on
the Earth at the north pole. The unit vector $\hat{\mathbf{n}}$ appears to move counter-clockwise, with the
perpendicular components of $\mathbf{S}$ and $\boldsymbol{\omega}$ lined up with it.

Fig. 9.2. For a prolate object such as a football, $\Omega_0 < 0$. In the figure on the left, the Euler angle
$\alpha$ increases steadily as the body cone rolls without slipping on the space cone, carrying
the $\hat{\mathbf{e}}'_i$ axes with it. The figure on the right shows the motion from the viewpoint of an
observer riding on the nose of the football. The unit vector $\hat{\mathbf{n}}$ appears to move in a clockwise
direction, with the perpendicular components of $\boldsymbol{\omega}$ and $\mathbf{S}$ lined up with it.

For another example, a thin rod has a ratio $(I^{(\mathrm{cm})\prime}_3 - I^{(\mathrm{cm})\prime}_1)/I^{(\mathrm{cm})\prime}_1$ that is negative,
and slightly greater than $-1$. A body-system observer sees the symmetry axis $\hat{\mathbf{e}}'_3$ as
fixed and the vectors $\boldsymbol{\omega}$ and $\mathbf{S}$ (listed here in order of their angle from $\hat{\mathbf{e}}'_3$) appear to
rotate about it in a negative sense, with an angular rate $\Omega_0 < 0$ that is nearly as large
in magnitude as the total angular velocity $\|\boldsymbol{\omega}\|$. The angle $\theta_{33'}$ is considerably larger
than $\theta_{\omega 3'}$ and hence the space cone is large compared to the body cone, as shown in Figure 9.2.

When tidal torques are ignored, the Earth is approximately a torque-free symmetric
rigid body of the sort described here. If we assume an ideal case in which it is
perfectly rigid and torque free, we can imagine the three vectors $\hat{\mathbf{e}}'_3$, $\boldsymbol{\omega}$, and $\mathbf{S}$ to be
drawn with a common origin at the center of the Earth and their lines extended out
through the surface of the Earth. These lines would all pierce the snow at or near the
north pole. The vector $\hat{\mathbf{e}}'_3$ defines the north pole, the geometric symmetry axis of the
Earth, and would appear fixed to a polar observer standing in the snow. The trace of
the other two vectors would appear to rotate in concentric circles around the north
pole in a positive sense, with a common radius direction $\hat{\mathbf{n}}(t)$. They would make one
complete circuit in a time $T_0 = 2\pi/\Omega_0$. This time can be calculated from the known
oblateness of the Earth,

$$T_0 = \frac{2\pi}{\Omega_0} = \frac{2\pi}{\omega'_{30}}\, \frac{I^{(\mathrm{cm})\prime}_1}{I^{(\mathrm{cm})\prime}_3 - I^{(\mathrm{cm})\prime}_1} \cong \frac{I^{(\mathrm{cm})\prime}_1}{I^{(\mathrm{cm})\prime}_3 - I^{(\mathrm{cm})\prime}_1}\ \text{days} = 306\ \text{days} \tag{9.70}$$
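The arithmetic behind eqn (9.70) can be spelled out in a few lines. The sketch below assumes a fractional difference of moments $(I_3 - I_1)/I_1 = 1/306$, a value chosen here to match the 306 days quoted above, and approximates $\omega'_{30}$ by one rotation per day. The function name and its arguments are hypothetical.

```python
import math

def free_wobble_period_days(moment_ratio, rotation_period_days=1.0):
    """Period T0 = 2 pi / Omega0 of eqn (9.70), in days.

    moment_ratio is (I3 - I1) / I1; omega'_30 is approximated by
    2 pi / rotation_period_days, which holds when A is much less than omega'_30.
    """
    w30 = 2.0 * math.pi / rotation_period_days
    Omega0 = w30 * moment_ratio          # eqn (9.60)
    return 2.0 * math.pi / Omega0

# Assumed ratio, chosen to reproduce the value quoted in the text.
T0 = free_wobble_period_days(1.0 / 306.0)
print("T0 =", T0, "days")
```

The period scales inversely with the moment ratio, so a more nearly spherical body would wobble more slowly.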



The Earth’s symmetry axis apparently does have a periodic variation, called the 
Chandler wobble, that can be associated with the effect calculated here. It has a small 
amplitude: The circles in the snow mentioned above would be of the order of 5 meters 
in radius. Also, it has a period of approximately 423 days and appears to be damped. 
Because of the damping, it is not simply a relic of the Earth’s creation with some 
nonzero A value as the above analysis would suggest, but must be sustained by some 
present energy source not included in our analysis here. 



9.10 Euler Angles of the Torque-Free Motion 

The motion of the symmetric rigid body in Section 9.9 can also be described by
establishing an inertial coordinate system with its $\hat{\mathbf{e}}_3$ axis along the fixed direction of
the spin $\mathbf{S}$, and then using Euler angles $\alpha, \beta, \gamma$ to describe the orientation of the
body-fixed $\hat{\mathbf{e}}'_i$ unit vectors relative to this inertial system. The motion of the body is then
seen from the viewpoint of an inertial observer, perhaps someone watching a football
pass as it spirals, or someone viewing the Earth's wobble from space.

In the present section, we assume the results derived in Section 9.9, but re-express
them in terms of these Euler angles.

In terms of the Euler angles, the body symmetry axis $\hat{\mathbf{e}}'_3$ will have spherical polar
coordinates $1, \beta, \alpha$ relative to inertial axes $\hat{\mathbf{e}}_i$, as was noted in eqn (8.265). The Euler
angle $\beta$ is the angle between $\hat{\mathbf{e}}'_3$ and $\hat{\mathbf{e}}_3$ and therefore must be constant here, equal to
the constant angle $0 < \theta_{3S} < \pi/2$ calculated in eqn (9.69). Thus $\beta = \beta_0 = \theta_{3S}$.
Equations (8.270 - 8.272) with $\beta = \beta_0$ and $\dot\beta = 0$ give the body system components
of the angular velocity in terms of the Euler angles. The angular velocity is

$$\boldsymbol{\omega} = -\dot\alpha \sin\beta_0 \cos\gamma\, \hat{\mathbf{e}}'_1 + \dot\alpha \sin\beta_0 \sin\gamma\, \hat{\mathbf{e}}'_2 + \left(\dot\alpha \cos\beta_0 + \dot\gamma\right) \hat{\mathbf{e}}'_3 \tag{9.71}$$

Equating the three components of this vector to the components of $\boldsymbol{\omega}$ in eqn (9.64)
gives

$$-\dot\alpha \sin\beta_0 \cos\gamma = A \cos\left(\Omega_0 t + \delta\right) \tag{9.72}$$

$$\dot\alpha \sin\beta_0 \sin\gamma = A \sin\left(\Omega_0 t + \delta\right) \tag{9.73}$$

$$\dot\alpha \cos\beta_0 + \dot\gamma = \omega'_{30} \tag{9.74}$$

It follows from eqns (9.72, 9.73) that $\dot\alpha$ and $\dot\gamma$ are constants with $\dot\alpha = \dot\alpha_0$ and
$\dot\gamma = \dot\gamma_0$, and that $\dot\gamma_0 = -\Omega_0$. Then eqn (9.74) shows that $\dot\alpha_0$ must be a positive constant,
$\dot\alpha_0 > 0$. With these conditions established, eqns (9.72 - 9.74) together imply that

$$\dot\alpha_0 \sin\beta_0 = A \qquad \dot\alpha_0 \cos\beta_0 = \frac{I_3^{(\mathrm{cm})\prime}\, \omega'_{30}}{I_1^{(\mathrm{cm})\prime}} \qquad \gamma = -\Omega_0 t - \delta + \pi \qquad \alpha = \dot\alpha_0 t + \kappa \tag{9.75}$$

where the constant $\kappa$ is determined from the initial conditions.
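The relations in eqn (9.75) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and sample values are hypothetical, and the expression used for $\Omega_0$ is the one implied by combining eqns (9.74) and (9.75) with $\dot\gamma_0 = -\Omega_0$. It assumes $A > 0$ and $\omega'_{30} > 0$.

```python
import math

def euler_rates(A, omega30, I1, I3):
    """Return (alpha_dot0, beta0, gamma_dot0) consistent with eqn (9.75)."""
    a_sin = A                      # alpha_dot0 * sin(beta0)
    a_cos = I3 * omega30 / I1      # alpha_dot0 * cos(beta0)
    alpha_dot0 = math.hypot(a_sin, a_cos)
    beta0 = math.atan2(a_sin, a_cos)
    # Omega0 as implied by eqns (9.74) and (9.75); treated here as an assumption.
    Omega0 = omega30 * (I3 - I1) / I1
    return alpha_dot0, beta0, -Omega0

def residuals(A, omega30, I1, I3, t, delta=0.3):
    """Left side minus right side of eqns (9.72)-(9.74) at time t."""
    ad, b0, gd = euler_rates(A, omega30, I1, I3)
    Omega0 = -gd
    phase = Omega0 * t + delta
    gamma = -phase + math.pi       # gamma from eqn (9.75)
    return (
        -ad * math.sin(b0) * math.cos(gamma) - A * math.cos(phase),
        ad * math.sin(b0) * math.sin(gamma) - A * math.sin(phase),
        ad * math.cos(b0) + gd - omega30,
    )
```

For an oblate body such as `I1 = 1.0`, `I3 = 1.2`, the sketch returns a negative `gamma_dot0`, and the three residuals vanish to rounding error.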

We now can express the vectors $\mathbf{S}$, $\hat{\mathbf{e}}'_3$, and $\boldsymbol{\omega}$ in the inertial system relative to which
the Euler angles are defined. These expressions will be consistent with the geometrical
constructions in Figures 9.1 and 9.2, and will show them from the inertial viewpoint.

The magnitude of the spin $\mathbf{S}$ may be calculated in terms of Euler angles using eqns
(9.67, 9.75). It is

$$S_0 = \|\mathbf{S}\| = I_1^{(\mathrm{cm})\prime}\, \dot\alpha_0 \tag{9.76}$$

Since the spin vector $\mathbf{S}$ is constant for a torque-free body, and since the $\hat{\mathbf{e}}_3$ axis of the
inertial system is defined to be along the direction of this vector, the spin vector will
at all times be equal to $\hat{\mathbf{e}}_3$ times its magnitude, or

$$\mathbf{S} = I_1^{(\mathrm{cm})\prime}\, \dot\alpha_0\, \hat{\mathbf{e}}_3 \tag{9.77}$$

The body symmetry vector $\hat{\mathbf{e}}'_3$ can be found from eqn (8.266). It is

$$\hat{\mathbf{e}}'_3 = \sin\beta_0\, \hat{\mathbf{q}}(t) + \cos\beta_0\, \hat{\mathbf{e}}_3 \tag{9.78}$$

where

$$\hat{\mathbf{q}}(t) = \cos\left(\dot\alpha_0 t + \kappa\right) \hat{\mathbf{e}}_1 + \sin\left(\dot\alpha_0 t + \kappa\right) \hat{\mathbf{e}}_2 \tag{9.79}$$

is a unit vector that rotates about $\hat{\mathbf{e}}_3$ in the positive sense. The expression for $\boldsymbol{\omega}$ in
terms of inertial system unit vectors will be derived in Exercise 9.8.

To an inertial system observer viewing the Earth, the spin $\mathbf{S}$ is constant and along
the $\hat{\mathbf{e}}_3$ axis. The symmetry axis of the body $\hat{\mathbf{e}}'_3$ is at angle $\beta_0$ from the $\hat{\mathbf{e}}_3$ axis, and
rotates about it in a positive sense, with a rate $\dot\alpha_0 > 0$ that is slightly larger than $\|\boldsymbol{\omega}\|$.
The third Euler angle $\gamma$ represents a rotation about the symmetry axis of the body
that is combined with the rotation already provided by $\alpha$. It moves in a retrograde
sense, with $\dot\gamma_0 < 0$ but small compared to $\|\boldsymbol{\omega}\|$. The vectors $\boldsymbol{\omega}$ and $\hat{\mathbf{e}}'_3$ lie on opposite
sides of $\hat{\mathbf{e}}_3$ and are both in the plane defined by $\hat{\mathbf{e}}_3$ and $\hat{\mathbf{q}}(t)$.

To an inertial observer viewing the long rod or the football, the spin $\mathbf{S}$ is constant
and along the $\hat{\mathbf{e}}_3$ axis. The symmetry axis of the body $\hat{\mathbf{e}}'_3$ is at angle $\beta_0$ from the $\hat{\mathbf{e}}_3$ axis,
and rotates about it in a positive sense, with a rate $\dot\alpha_0 > 0$ that is much smaller than
$\|\boldsymbol{\omega}\|$. The third Euler angle $\gamma$ moves in a positive sense, with $\dot\gamma_0 > 0$ and a magnitude
only slightly smaller than $\|\boldsymbol{\omega}\|$. The vectors $\boldsymbol{\omega}$ and $\hat{\mathbf{e}}'_3$ are on the same side of $\hat{\mathbf{e}}_3$ and
are both in the plane defined by $\hat{\mathbf{e}}_3$ and $\hat{\mathbf{q}}(t)$.



9.11 Body with One Point Fixed 

In Section 9.8, we considered a rigid body moving freely in empty space, like a tum- 
bling asteroid. The motion of its center of mass was therefore governed by the same 
laws of motion as for any collection of masses, rigid or not. 

We now consider another class of interesting cases, ones in which the rigid body 
is not floating freely but has one of its points constrained to be fixed. Examples are a 
top spinning with its point set into a depression that holds it fixed, a gyroscope with 
a point along its symmetry axis held fixed, etc. 

Suppose that a point P of a rigid body is constrained to be at rest. Place an inertial
coordinate system with its origin at that fixed point. Then the motion of the rigid body
can be derived from the time evolution of the total angular momentum $\mathbf{J}$ relative to
this inertial system, as given in Axiom 1.5.1,

$$\frac{d\mathbf{J}}{dt} = \boldsymbol{\tau}^{(\mathrm{ext})} \qquad \text{where} \qquad \mathbf{J} = \sum_{n=1}^{N} \mathbf{r}_n \times m_n \mathbf{v}_n = \mathbf{L} + \mathbf{S} \tag{9.80}$$

where $\mathbf{L}$ and $\mathbf{S}$ are defined in Section 1.11, and $\boldsymbol{\tau}^{(\mathrm{ext})}$ is the total external torque
relative to the origin of coordinates as defined in Section 1.5.

We want to find an operator $\mathcal{I}$ that maps the angular velocity $\boldsymbol{\omega}$ into $\mathbf{J}$, similar to
the operator $\mathcal{I}^{(\mathrm{cm})}$ defined for the spin in Section 9.2. Since an operator expression for
the spin, $\mathbf{S} = \mathcal{I}^{(\mathrm{cm})} \boldsymbol{\omega}$, has already been derived in Section 9.2, an obvious approach is
to find an operator expression for the orbital angular momentum $\mathbf{L}$ and then to use
$\mathbf{J} = \mathbf{L} + \mathbf{S}$ to find $\mathbf{J}$.

From eqns (9.1, 9.2), 

L = R x MV (9.81) 



where V = dR/dt is the velocity of the center of mass R, and M is the total mass of 
the body. This is the same definition as for the tumbling asteroid, or for any collection 
of point masses. But, when one point of the body is fixed at the origin of coordinates, 
both ends of R are now fixed relative to the rigid body, and so vector R must move 
with the body. 

To derive an expression for $\mathbf{L}$, let us begin by supposing that we have already
found the principal axes of the rigid body relative to its center of mass, as discussed
in Section 9.6. Then there is already a body-fixed coordinate system with principal
axis unit vectors $\hat{\mathbf{e}}'_i$ and its origin at the center of mass. The vector $\mathbf{R}$ can be expanded
in that body system as

$$\mathbf{R} = \sum_{i=1}^{3} R'_i\, \hat{\mathbf{e}}'_i \qquad \text{where} \qquad R'_i = \hat{\mathbf{e}}'_i \cdot \mathbf{R} \tag{9.82}$$

Since both the unit vectors $\hat{\mathbf{e}}'_i$ and $\mathbf{R}$ are now embedded in the same rigid body,
the components $R'_i$ will all be constants with $dR'_i/dt = 0$. Hence the body derivative
of $\mathbf{R}$ defined in Section 8.33 vanishes,

$$\left(\frac{d\mathbf{R}}{dt}\right)_b = \sum_{i=1}^{3} \frac{dR'_i}{dt}\, \hat{\mathbf{e}}'_i = 0 \tag{9.83}$$

and the total time derivative reduces to

$$\frac{d\mathbf{R}}{dt} = \left(\frac{d\mathbf{R}}{dt}\right)_b + \boldsymbol{\omega} \times \mathbf{R} = \boldsymbol{\omega} \times \mathbf{R} \tag{9.84}$$



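The orbital angular momentum $\mathbf{L}$ then becomes

$$\mathbf{L} = \mathbf{R} \times M\mathbf{V} = M\mathbf{R} \times \left(\boldsymbol{\omega} \times \mathbf{R}\right) = M\left\{R^2 \boldsymbol{\omega} - \mathbf{R}\left(\mathbf{R} \cdot \boldsymbol{\omega}\right)\right\} \tag{9.85}$$

Writing this equation out in terms of components in the body system gives

$$L'_i = M\left\{\left(R_1'^2 + R_2'^2 + R_3'^2\right) \omega'_i - R'_i \sum_{j=1}^{3} R'_j \omega'_j\right\} = \sum_{j=1}^{3} I_{ij}^{(\mathrm{orb})\prime}\, \omega'_j \tag{9.86}$$

where the orbital inertia matrix in the body system is defined by

$$I_{ij}^{(\mathrm{orb})\prime} = M\left\{\left(R_1'^2 + R_2'^2 + R_3'^2\right) \delta_{ij} - R'_i R'_j\right\} \tag{9.87}$$

Corresponding to the last expression in eqn (9.86), there is an operator equation

$$\mathbf{L} = \mathcal{I}^{(\mathrm{orb})} \boldsymbol{\omega} \tag{9.88}$$

where $\mathcal{I}^{(\mathrm{orb})}$ is the operator whose matrix elements in the body system are given
by eqn (9.87). Notice that, since the $R'_i$ are constants, as was discussed above, the
matrix elements $I_{ij}^{(\mathrm{orb})\prime}$ will also be constant in time and have zero time derivatives.
The operator equation in eqn (9.88) may also be written as the equivalent dyadic
equation

$$\mathbf{L} = \mathbb{I}^{(\mathrm{orb})} \cdot \boldsymbol{\omega} \qquad \text{where} \qquad \mathbb{I}^{(\mathrm{orb})} = M\left(R^2\, \mathbb{U} - \mathbf{R}\mathbf{R}\right) \tag{9.89}$$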

With the orbital angular momentum $\mathbf{L}$ now determined, the total angular momentum
$\mathbf{J}$ may be written as

$$\mathbf{J} = \mathbf{L} + \mathbf{S} = \mathcal{I}^{(\mathrm{orb})} \boldsymbol{\omega} + \mathcal{I}^{(\mathrm{cm})} \boldsymbol{\omega} = \left(\mathcal{I}^{(\mathrm{orb})} + \mathcal{I}^{(\mathrm{cm})}\right) \boldsymbol{\omega} = \mathcal{I} \boldsymbol{\omega} \tag{9.90}$$

where the total inertia operator is defined by

$$\mathcal{I} = \mathcal{I}^{(\mathrm{orb})} + \mathcal{I}^{(\mathrm{cm})} \tag{9.91}$$

In terms of components in the body system, we have

$$J'_i = \sum_{j=1}^{3} I'_{ij}\, \omega'_j \tag{9.92}$$

for $i = 1, 2, 3$, where

$$I'_{ij} = M\left\{\left(R_1'^2 + R_2'^2 + R_3'^2\right) \delta_{ij} - R'_i R'_j\right\} + I_j^{(\mathrm{cm})\prime}\, \delta_{ij} \tag{9.93}$$

The delta function appears in the last term because the body system is assumed to
be a principal axis system for the center of mass inertia operator $\mathcal{I}^{(\mathrm{cm})}$. If the
body system is not the center of mass principal axis system, then this term will be
replaced by the non-diagonal matrix $I_{ij}^{(\mathrm{cm})\prime}$. Equation (9.93) will be referred to as the
translation of pivot theorem since it expresses the inertia tensor about a fixed point
displaced from the center of mass.

The dyadic equivalent to operator $\mathcal{I}$ can also be written. It is

$$\mathbb{I} = \mathbb{I}^{(\mathrm{orb})} + \mathbb{I}^{(\mathrm{cm})} = M\left(R^2\, \mathbb{U} - \mathbf{R}\mathbf{R}\right) + \mathbb{I}^{(\mathrm{cm})} \tag{9.94}$$

where $\mathbb{I}^{(\mathrm{cm})}$ is the dyadic expressed in the center of mass principal axis system by eqn
(9.40).
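As an illustration of the translation of pivot theorem, eqn (9.93), the following minimal sketch builds the total inertia matrix in the body system from the mass, the body components of $\mathbf{R}$, and the center of mass principal moments. The function name and the sample numbers are hypothetical, chosen only for illustration.

```python
def total_inertia(M, R, I_cm):
    """Matrix I'_ij = M (R^2 delta_ij - R'_i R'_j) + I_j^(cm)' delta_ij, eqn (9.93).

    M    : total mass
    R    : body-system components (R'_1, R'_2, R'_3) of the vector R
    I_cm : center of mass principal moments of inertia
    """
    R2 = sum(x * x for x in R)
    I = [[M * ((R2 if i == j else 0.0) - R[i] * R[j]) for j in range(3)]
         for i in range(3)]
    for i in range(3):
        I[i][i] += I_cm[i]
    return I

# R along the third body axis: the matrix stays diagonal.
on_axis = total_inertia(2.0, (0.0, 0.0, 0.5), (1.0, 1.0, 0.4))
# R off the principal axes: a product of inertia appears in the 1-2 element.
off_axis = total_inertia(2.0, (0.3, 0.4, 0.0), (1.0, 1.0, 0.4))
```

In the first call the diagonal elements perpendicular to $\mathbf{R}$ gain $MR^2$ while the third is unchanged; in the second call the 1-2 element is $-M R'_1 R'_2$.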

In eqn (9.20), the internal kinetic energy $T_I$ was given in terms of the spin and
the angular velocity. The same can be done for the orbital kinetic energy $T_o$ for rigid
bodies moving with one point fixed. Starting with the definition in eqn (9.19), and
using eqn (9.84),

$$T_o = \frac{1}{2} M V^2 = \frac{1}{2} M\mathbf{V} \cdot \mathbf{V} = \frac{1}{2} M\mathbf{V} \cdot \boldsymbol{\omega} \times \mathbf{R} = \frac{1}{2} \mathbf{R} \times M\mathbf{V} \cdot \boldsymbol{\omega} = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L} \tag{9.95}$$

Combining this result with eqn (9.20) then gives

$$T = T_o + T_I = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L} + \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{S} = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{J} = \frac{1}{2} \boldsymbol{\omega} \cdot \left(\mathcal{I} \boldsymbol{\omega}\right) \tag{9.96}$$

where eqn (9.90) was used. Expanding the last expression on the right in eqn (9.96)
in the body basis gives

$$T = \frac{1}{2} \sum_{i=1}^{3} \sum_{j=1}^{3} I'_{ij}\, \omega'_i \omega'_j \tag{9.97}$$

which is the same as eqn (9.21) for the internal kinetic energy $T_I$, but with the
center-of-mass inertia matrix $I_{ij}^{(\mathrm{cm})\prime}$ now replaced by the total inertia matrix $I'_{ij}$.

9.12 Preserving the Principal Axes

An unfortunate feature of eqn (9.93) is that, although the matrix $I^{(\mathrm{cm})\prime}$ is diagonal by
the assumption that the center of mass principal axis system is being used, the matrix
$I'$ may not be. Moving the reference point from the center of mass to fixed point P
may introduce non-diagonal terms. If so, then the whole calculation of the principal
axes will have to be done again.

However, there is a class of important special cases in which the center of mass
principal axes are preserved. If the vector $\mathbf{R}$ from the fixed point to the center of mass
happens to lie along one of the $\hat{\mathbf{e}}'_i$ directions, then all products of inertia in eqn (9.87)
will vanish. For example, suppose that $\mathbf{R} = R\,\hat{\mathbf{e}}'_3$. Then $R'_1 = R'_2 = 0$ and $R'_3 = R$ with
the result that

$$I'_{ij} = M R^2 \left(\delta_{ij} - \delta_{i3}\delta_{j3}\right) + I_j^{(\mathrm{cm})\prime}\, \delta_{ij} \tag{9.98}$$

Thus $i \neq j$ implies that $I'_{ij} = 0$ and the diagonal elements become

$$I'_{11} = M R^2 + I_1^{(\mathrm{cm})\prime} \qquad I'_{22} = M R^2 + I_2^{(\mathrm{cm})\prime} \qquad I'_{33} = I_3^{(\mathrm{cm})\prime} \tag{9.99}$$

In general, when $\mathbf{R}$ is along one of the center of mass principal axes $\hat{\mathbf{e}}'_k$, the principal
axes of the problem are unchanged, the principal moments of inertia along the axes
perpendicular to $\hat{\mathbf{e}}'_k$ have $MR^2$ added to them, and the principal moment of inertia
along $\hat{\mathbf{e}}'_k$ itself is unchanged.

Assume now that we have preserved the center of mass principal axes, or otherwise
found principal axes that make the total inertia operator $\mathcal{I}$ diagonal, and are
now using a principal axis system of the total inertia operator. The formulas for the
total angular momentum and the total kinetic energy then become simpler, just as
the formulas for the spin angular momentum and the internal kinetic energy did in
Section 9.6.

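The relation

$$\mathbf{J} = \mathcal{I} \boldsymbol{\omega} \tag{9.100}$$

is expressed in component form in eqn (9.92). If the body system is the principal axis
system for $\mathcal{I}$ then, for $i = 1, 2, 3$,

$$I'_{ij} = I'_j\, \delta_{ij} \qquad \text{and hence} \qquad J'_i = \sum_{j=1}^{3} I'_j\, \delta_{ij}\, \omega'_j = I'_i\, \omega'_i \tag{9.101}$$

Just as in eqn (9.37), we emphasize that there is no sum implied in this last equation.
Each component $J'_i$ is just the corresponding component $\omega'_i$ multiplied by the
principal moment of inertia $I'_i$,

$$J'_1 = I'_1 \omega'_1 \qquad J'_2 = I'_2 \omega'_2 \qquad J'_3 = I'_3 \omega'_3 \tag{9.102}$$

In this same principal axis system, the total kinetic energy in eqn (9.97) simplifies to
a single sum

$$T = \frac{1}{2} \sum_{i=1}^{3} \sum_{j=1}^{3} I'_j\, \delta_{ij}\, \omega'_i \omega'_j = \frac{1}{2} \sum_{i=1}^{3} I'_i\, \omega_i'^2 \tag{9.103}$$

9.13 Time Evolution with One Point Fixed

The time evolution of the spin was calculated in Section 9.8. The same methods used
there for the spin equation of motion $d\mathbf{S}/dt = \boldsymbol{\tau}_s^{(\mathrm{ext})}$ can also be used for the total
angular momentum equation of motion $d\mathbf{J}/dt = \boldsymbol{\tau}^{(\mathrm{ext})}$. Just replace $\mathbf{S}$ by $\mathbf{J}$, and $\boldsymbol{\tau}_s$ by
$\boldsymbol{\tau}$, and $\mathcal{I}^{(\mathrm{cm})}$ by $\mathcal{I}$ throughout.

Assuming that we are now using body axes that are principal axes for the total
inertia operator $\mathcal{I}$, the Euler equations analogous to eqns (9.53 - 9.55) are

$$I'_1 \dot\omega'_1 = \omega'_2 \omega'_3 \left(I'_2 - I'_3\right) + \tau'_1 \tag{9.104}$$

$$I'_2 \dot\omega'_2 = \omega'_3 \omega'_1 \left(I'_3 - I'_1\right) + \tau'_2 \tag{9.105}$$

$$I'_3 \dot\omega'_3 = \omega'_1 \omega'_2 \left(I'_1 - I'_2\right) + \tau'_3 \tag{9.106}$$

where the torque components are defined by $\tau'_i = \hat{\mathbf{e}}'_i \cdot \boldsymbol{\tau}^{(\mathrm{ext})}$.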
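The Euler equations (9.104 - 9.106) can be integrated numerically as a set of first-order equations for the body components of the angular velocity. The sketch below is one possible way to do this, using a classical fourth-order Runge-Kutta step; the function names, the step size, and the sample moments of inertia are hypothetical choices, not taken from the text.

```python
def euler_rhs(w, I, tau=(0.0, 0.0, 0.0)):
    """Time derivatives of (w1, w2, w3) from eqns (9.104)-(9.106)."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return (
        (w2 * w3 * (I2 - I3) + tau[0]) / I1,
        (w3 * w1 * (I3 - I1) + tau[1]) / I2,
        (w1 * w2 * (I1 - I2) + tau[2]) / I3,
    )

def rk4_step(w, I, dt, tau=(0.0, 0.0, 0.0)):
    """Advance the body components of omega by one Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = euler_rhs(w, I, tau)
    k2 = euler_rhs(shifted(w, k1, dt / 2), I, tau)
    k3 = euler_rhs(shifted(w, k2, dt / 2), I, tau)
    k4 = euler_rhs(shifted(w, k3, dt), I, tau)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

def kinetic_energy(w, I):
    """Total kinetic energy in the principal axis system, eqn (9.103)."""
    return 0.5 * sum(Ii * wi * wi for Ii, wi in zip(I, w))
```

With zero torque, the kinetic energy of eqn (9.103) should stay constant along the computed trajectory to within integration error, which gives a simple check on the step size.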

9.14 Body with One Point Fixed, Alternate Derivation 

An operator expression for the total angular momentum $\mathbf{J}$ of a rigid body moving
with one point fixed can also be derived directly, without reference to $\mathbf{L}$ and $\mathbf{S}$. The
operator $\mathcal{I}$ obtained will be the same as that derived in Section 9.11.

The basic definition of the total angular momentum of any collection, including a
rigid body, is

$$\mathbf{J} = \sum_{n=1}^{N} \mathbf{r}_n \times m_n \mathbf{v}_n \tag{9.107}$$

where $\mathbf{v}_n = d\mathbf{r}_n/dt$ and $\mathbf{r}_n$ is the vector from the fixed point P (which is taken as the
origin of the inertial coordinate system) to the mass $m_n$.

Assume that some body-fixed system of coordinates has been defined. Since all
of the vectors $\mathbf{r}_n$ connect two points of the same rigid body, their components in this
body system of coordinates must be constants, just as the components of $\mathbf{R}$ were in
Section 9.11. Thus the body derivatives vanish, $(d\mathbf{r}_n/dt)_b = 0$, and

$$\mathbf{v}_n = \frac{d\mathbf{r}_n}{dt} = \boldsymbol{\omega} \times \mathbf{r}_n \tag{9.108}$$


Hence

$$\mathbf{J} = \sum_{n=1}^{N} m_n\, \mathbf{r}_n \times \left(\boldsymbol{\omega} \times \mathbf{r}_n\right) \tag{9.109}$$

Now using the same pattern found in Section 9.2, but with $\boldsymbol{\rho}_n$ replaced everywhere
by $\mathbf{r}_n$, the components of $\mathbf{J}$ in the body fixed system of coordinates can be reduced to

$$J'_i = \sum_{j=1}^{3} I'_{ij}\, \omega'_j \tag{9.110}$$

where

$$I'_{ij} = \sum_{n=1}^{N} m_n \left\{\left(r_{n1}'^2 + r_{n2}'^2 + r_{n3}'^2\right) \delta_{ij} - r'_{ni}\, r'_{nj}\right\} \tag{9.111}$$

Thus

$$\mathbf{J} = \mathcal{I} \boldsymbol{\omega} \tag{9.112}$$

where operator $\mathcal{I}$ is the operator whose matrix elements in the body system are $I'_{ij}$.
The dyadic form of $\mathcal{I}$ is

$$\mathbb{I} = \sum_{n=1}^{N} m_n \left\{\left(\mathbf{r}_n \cdot \mathbf{r}_n\right) \mathbb{U} - \mathbf{r}_n \mathbf{r}_n\right\} \tag{9.113}$$

The matrix $I'$ defined in eqn (9.111) can be diagonalized to find a principal axis
system for the total inertia tensor $\mathcal{I}$. The result will be the same (except for possible
degeneracy of eigenvectors) as that obtained by the more indirect route taken in
Sections 9.11 and 9.12.
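The agreement between the direct sum in eqn (9.111) and the translation of pivot form can be checked for a small collection of point masses. The following sketch is illustrative only: the function names are hypothetical, and the center of mass matrix is left in its general non-diagonal form rather than being reduced to principal axes.

```python
def inertia_about_origin(masses, positions):
    """Direct sum of eqn (9.111) for positions measured from the fixed point."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(x * x for x in r)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r2 if i == j else 0.0) - r[i] * r[j])
    return I

def inertia_via_pivot_theorem(masses, positions):
    """Orbital part M (R^2 delta_ij - R_i R_j) plus the center of mass matrix."""
    M = sum(masses)
    R = [sum(m * r[k] for m, r in zip(masses, positions)) / M for k in range(3)]
    relative = [tuple(r[k] - R[k] for k in range(3)) for r in positions]
    I_cm = inertia_about_origin(masses, relative)
    R2 = sum(x * x for x in R)
    return [[M * ((R2 if i == j else 0.0) - R[i] * R[j]) + I_cm[i][j]
             for j in range(3)] for i in range(3)]
```

For any set of masses and positions the two functions should return the same matrix up to rounding error.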



9.15 Work-Energy Theorems 

In Section 1.16 we showed that the rate of change of the total kinetic energy $T$ of a
rigid body can be written as

$$\frac{dT}{dt} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{ext})} \cdot \mathbf{v}_n \tag{9.114}$$

Using eqn (9.108), in the case of a rigid body with one point fixed at the origin of an
inertial coordinate system this result can be written as

$$\frac{dT}{dt} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{ext})} \cdot \boldsymbol{\omega} \times \mathbf{r}_n = \sum_{n=1}^{N} \mathbf{r}_n \times \mathbf{f}_n^{(\mathrm{ext})} \cdot \boldsymbol{\omega} = \boldsymbol{\tau}^{(\mathrm{ext})} \cdot \boldsymbol{\omega} \tag{9.115}$$

where the definitions in eqns (1.17, 1.18) have been used. Thus an external torque
that is always perpendicular to the angular velocity vector will do no work, and will
not change the total kinetic energy of the rigid body.

A similar result holds for the internal kinetic energy $T_I$ and the torque $\boldsymbol{\tau}_s^{(\mathrm{ext})}$ defined
in eqn (1.49). Starting again from the result in Section 1.16,

$$\frac{dT_I}{dt} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{ext})} \cdot \dot{\boldsymbol{\rho}}_n \tag{9.116}$$

the rate of change of the internal kinetic energy can be written using eqn (9.5) as

$$\frac{dT_I}{dt} = \sum_{n=1}^{N} \mathbf{f}_n^{(\mathrm{ext})} \cdot \boldsymbol{\omega} \times \boldsymbol{\rho}_n = \sum_{n=1}^{N} \boldsymbol{\rho}_n \times \mathbf{f}_n^{(\mathrm{ext})} \cdot \boldsymbol{\omega} = \boldsymbol{\tau}_s^{(\mathrm{ext})} \cdot \boldsymbol{\omega} \tag{9.117}$$

9.16 Rotation with a Fixed Axis

There is a class of problems in which the rigid body is constrained even more severely
than simply by having one point fixed. It might be constrained to rotate about a fixed
axis, as on a lathe. Examples of this sort are often used in elementary textbooks to
introduce students to "rotary motion." However, it is instructive to see precisely how
these elementary results fit into the general theory being presented here.

Imagine that a fixed axis passes through the rigid body and is rigidly connected to
it. If the angle of rotation about the fixed axis is denoted $\Phi$, then eqn (8.197) gives at
once that

$$\boldsymbol{\omega} = \dot\Phi\, \hat{\mathbf{n}} \qquad \text{or in component form} \qquad \omega_i = \dot\Phi\, n_i \quad \text{for } i = 1, 2, 3 \tag{9.118}$$

where $\hat{\mathbf{n}}$ is a constant unit vector, along the fixed axis and pointing in the direction
related to the positive direction of $\Phi$ by a right-hand rule. Taking the origin of an
inertial coordinate system to be some point on the fixed axis, eqn (9.112) then gives
the angular momentum as

$$\mathbf{J} = \mathcal{I} \boldsymbol{\omega} = \dot\Phi\, \mathcal{I} \hat{\mathbf{n}} \tag{9.119}$$

Note that $\mathcal{I}$ is an operator and that in general $\mathbf{J}$ will not point in the same direction
as $\boldsymbol{\omega}$.

If, for example, a lathe is running and constraining $\Phi$ to have a given value, or
if $\Phi$ is otherwise known as a function of time, then $\boldsymbol{\omega}$ is known. The torques acting
on the rigid body can be calculated by putting the known components of $\boldsymbol{\omega}$ from eqn
(9.118) into the Euler equations, eqns (9.104 - 9.106), and solving for the torque
components.

However, there is another class of problem in which the motor of the lathe is
assumed to be disconnected, so that the rigid body moves freely about the fixed axis.
An example might be a rear wheel of a front-wheel-drive automobile. The angle $\Phi$
then becomes a free dynamical variable. A differential equation for that variable can
be derived that depends only on the component of torque parallel to the fixed axis $\hat{\mathbf{n}}$.
Dotting $\hat{\mathbf{n}}$ from the left onto both sides of eqn (9.80) gives

$$\frac{d}{dt}\left(\hat{\mathbf{n}} \cdot \mathbf{J}\right) = \hat{\mathbf{n}} \cdot \frac{d\mathbf{J}}{dt} = \hat{\mathbf{n}} \cdot \boldsymbol{\tau}^{(\mathrm{ext})} \tag{9.120}$$

where the constancy of $\hat{\mathbf{n}}$ allows it to be taken inside the time derivative. Then,
introducing eqn (9.119) gives

$$\frac{d}{dt}\left(\hat{\mathbf{n}} \cdot \dot\Phi\, \mathcal{I} \hat{\mathbf{n}}\right) = \hat{\mathbf{n}} \cdot \boldsymbol{\tau}^{(\mathrm{ext})} \qquad \text{or} \qquad I_n \ddot\Phi = \tau_n \tag{9.121}$$

where the definitions

$$I_n = \hat{\mathbf{n}} \cdot \left(\mathcal{I} \hat{\mathbf{n}}\right) \qquad \text{and} \qquad \tau_n = \hat{\mathbf{n}} \cdot \boldsymbol{\tau}^{(\mathrm{ext})} \tag{9.122}$$

have been introduced.

The $I_n$ will be shown below to be the moment of inertia about the fixed rotation
axis. It will be shown to be a constant, as has already been assumed in deriving eqn
(9.121). The torque term $\tau_n$ is the component of the external torque parallel to the
axis. In elementary textbooks, the expression $\tau_n = I_n \ddot\Phi$ from eqn (9.121) is sometimes
referred to as the "F equals MA of rotary motion."

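The quantity $I_n$ is easiest to understand if the dyadic expression in eqn (9.113) is
used,

$$I_n = \hat{\mathbf{n}} \cdot \left(\mathcal{I} \hat{\mathbf{n}}\right) = \hat{\mathbf{n}} \cdot \mathbb{I} \cdot \hat{\mathbf{n}} = \sum_{n=1}^{N} m_n \left\{\left(\mathbf{r}_n \cdot \mathbf{r}_n\right) \hat{\mathbf{n}} \cdot \mathbb{U} \cdot \hat{\mathbf{n}} - \left(\hat{\mathbf{n}} \cdot \mathbf{r}_n\right)\left(\mathbf{r}_n \cdot \hat{\mathbf{n}}\right)\right\} = \sum_{n=1}^{N} m_n \left\{\left(\mathbf{r}_n \cdot \mathbf{r}_n\right) - \left(\hat{\mathbf{n}} \cdot \mathbf{r}_n\right)\left(\mathbf{r}_n \cdot \hat{\mathbf{n}}\right)\right\} \tag{9.123}$$

If we decompose each $\mathbf{r}_n$ into a vector $\mathbf{r}_{n\parallel} = \hat{\mathbf{n}}\left(\hat{\mathbf{n}} \cdot \mathbf{r}_n\right)$ parallel to $\hat{\mathbf{n}}$ and a vector $\mathbf{r}_{n\perp}$
perpendicular to $\hat{\mathbf{n}}$, in the manner described in Section A.2, the expression in the last
of eqn (9.123) reduces to

$$I_n = \sum_{n=1}^{N} m_n \left\|\mathbf{r}_{n\perp}\right\|^2 \tag{9.124}$$

This expression is the sum of each mass multiplied by the square of its perpendicular
distance from the rotation axis, which is the definition of the moment of inertia about
that axis. Since both the origin of coordinates on the axis, and the masses $m_n$ are embedded
rigidly in the same rigid body, all dot products in eqn (9.123) will be constants. Hence
$I_n$ is constant, as was asserted above.

In the special case that the fixed axis happens to pass through the center of mass of
the rigid body, the above analysis will still hold, but with $\mathbf{J}$, $\mathcal{I}$, $\mathbb{I}$, $I$, $\mathbf{r}_n$, $\boldsymbol{\tau}^{(\mathrm{ext})}$ replaced
by $\mathbf{S}$, $\mathcal{I}^{(\mathrm{cm})}$, $\mathbb{I}^{(\mathrm{cm})}$, $I^{(\mathrm{cm})}$, $\boldsymbol{\rho}_n$, $\boldsymbol{\tau}_s^{(\mathrm{ext})}$, respectively.

Returning to the general case, there is an interesting relation between $I_n$ and
$I_n^{(\mathrm{cm})}$, called the parallel axis theorem. Using the translation of pivot theorem from
eqn (9.94),

$$I_n = \hat{\mathbf{n}} \cdot \mathbb{I} \cdot \hat{\mathbf{n}} = M\left\{R^2 - \left(\hat{\mathbf{n}} \cdot \mathbf{R}\right)^2\right\} + \hat{\mathbf{n}} \cdot \mathbb{I}^{(\mathrm{cm})} \cdot \hat{\mathbf{n}} = M R_\perp^2 + I_n^{(\mathrm{cm})} \tag{9.125}$$

where $\mathbf{R} = \mathbf{R}_\parallel + \mathbf{R}_\perp$ decomposes $\mathbf{R}$ into vectors parallel and perpendicular to $\hat{\mathbf{n}}$.
The moment of inertia $I_n$ about an axis $\hat{\mathbf{n}}$ is equal to the moment of inertia $I_n^{(\mathrm{cm})}$
about a parallel axis passing through the center of mass, plus the total mass times the
square of the perpendicular distance between the two axes.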
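The parallel axis theorem of eqn (9.125) can be checked for point masses with a few lines of code. This is a minimal sketch under stated assumptions: the function name and its `point` argument (any point on the axis) are hypothetical, and the moment is evaluated directly from eqn (9.124).

```python
import math

def moment_about_axis(masses, positions, n, point=(0.0, 0.0, 0.0)):
    """Sum of m_n times squared perpendicular distance from the axis, eqn (9.124).

    n     : direction of the axis (normalized internally)
    point : a point through which the axis passes
    """
    norm = math.sqrt(sum(c * c for c in n))
    nhat = [c / norm for c in n]
    total = 0.0
    for m, r in zip(masses, positions):
        d = [r[k] - point[k] for k in range(3)]
        parallel = sum(d[k] * nhat[k] for k in range(3))
        total += m * (sum(c * c for c in d) - parallel * parallel)
    return total
```

Calling the function once with `point` at the origin and once with `point` at the center of mass gives two moments that should differ by $M R_\perp^2$.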

9.17 The Symmetric Top with One Point Fixed 

Although the Euler equations eqns (9.104 - 9.106) are correct, they are less useful
than they might be because the variables $\omega'_i$ in them are not good generalized
coordinates in the Lagrangian sense described in Chapter 2. They are not even the time
derivatives of good generalized coordinates.
However, a Lagrangian theory of rigid-body motion is possible, since, when used
to specify the orientation of a body-fixed system of unit vectors, the Euler angles $\alpha$,
$\beta$, $\gamma$ are good generalized coordinates. To demonstrate this from the Jacobian
determinant condition of eqn (2.27) would be a daunting task indeed, since a rigid body
with one point fixed has some $10^{25}$ degrees of freedom and $(10^{25} - 3)$ independent
constraints. The "goodness" of the Euler angles must be established by going back to
the property behind the Jacobian determinant condition: bi-uniqueness. The generalized
coordinates are "good" if they, together with the constraints, uniquely determine
the Cartesian coordinates of each of the point masses, and if conversely the Cartesian
coordinates of all the point masses determine them uniquely. This bi-uniqueness
condition is satisfied by the Euler angles.

Thus a reduced Lagrangian may be written using the usual formula $L = T - U$
where, after the constraints are applied, the (highly) reduced Lagrangian is

$$L(\alpha, \beta, \gamma, \dot\alpha, \dot\beta, \dot\gamma, t) = T(\alpha, \beta, \gamma, \dot\alpha, \dot\beta, \dot\gamma) - U(\alpha, \beta, \gamma) \tag{9.126}$$

We wish to apply these Lagrangian methods to the motion of a symmetric top moving
with a point on its symmetry axis fixed.




Fig. 9.3. A symmetric top spins with one point fixed, at the origin of an inertial coordinate
system $\hat{\mathbf{e}}_i$. The Euler angle $\gamma$ represents the spin of the top about its symmetry axis $\hat{\mathbf{e}}'_3$.

We assume that the symmetric top is a body of revolution in the sense of Lemma
9.7.3 so that its symmetry axis, taken conventionally to be $\hat{\mathbf{e}}'_3$, and any two axes
perpendicular to the symmetry axis, are principal axes of the center of mass inertia
operator. Assume moreover that the top moves with a fixed point that is on the symmetry
axis. Then, according to the analysis in Section 9.12, the center of mass principal axes
will be preserved and will also be the principal axes of the total inertia tensor. Then
the principal moments of inertia will obey $I'_1 = I'_2 \neq I'_3$.

The force of gravity is assumed to be acting in a downward direction on the top.
An inertial coordinate system with its origin at the fixed point of the top is defined
with its $\hat{\mathbf{e}}_3$ axis upwards, so that $\mathbf{g} = -g\,\hat{\mathbf{e}}_3$. As shown in Exercise 1.11 for a general
collection, the potential energy of a rigid body in a uniform gravitational field is given
by $U = -M\mathbf{g} \cdot \mathbf{R}$ where $\mathbf{R}$ points to its center of mass. Here, the vector $\mathbf{R}$ is along the
$\hat{\mathbf{e}}'_3$ direction with $\mathbf{R} = R\,\hat{\mathbf{e}}'_3$, where $R$ is the constant magnitude of $\mathbf{R}$. Thus

$$U(\alpha, \beta, \gamma) = -M\left(-g\,\hat{\mathbf{e}}_3\right) \cdot \left(R\,\hat{\mathbf{e}}'_3\right) = MgR\, \hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_3 = MgR \cos\beta \tag{9.127}$$

since, as noted in Section 8.35, the spherical polar coordinates of $\hat{\mathbf{e}}'_3$ relative to the
inertial system are $(1, \beta, \alpha)$.

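The kinetic energy can be obtained from eqn (9.103). Setting $I'_1 = I'_2$ this equation
reduces to

$$T = \frac{1}{2} \sum_{i=1}^{3} I'_i\, \omega_i'^2 = \frac{1}{2} I'_1 \left(\omega_1'^2 + \omega_2'^2\right) + \frac{1}{2} I'_3\, \omega_3'^2 \tag{9.128}$$

The angular velocity components in terms of the Euler angles and their derivatives
are given in eqns (8.270 - 8.272). Substituting these and simplifying gives

$$T = \frac{1}{2} I'_1 \left(\dot\alpha^2 \sin^2\beta + \dot\beta^2\right) + \frac{1}{2} I'_3 \left(\dot\alpha \cos\beta + \dot\gamma\right)^2 \tag{9.129}$$

Thus the reduced Lagrangian is

$$L(\alpha, \beta, \gamma, \dot\alpha, \dot\beta, \dot\gamma, t) = \frac{1}{2} I'_1 \left(\dot\alpha^2 \sin^2\beta + \dot\beta^2\right) + \frac{1}{2} I'_3 \left(\dot\alpha \cos\beta + \dot\gamma\right)^2 - MgR \cos\beta \tag{9.130}$$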

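The variables $\alpha$ and $\gamma$ are seen to be ignorable. So we deal with them first. The
reduced Lagrange equation for $\gamma$ is

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot\gamma}\right) - \frac{\partial L}{\partial \gamma} = 0 \tag{9.131}$$

which simplifies to

$$I'_3 \left(\dot\alpha \cos\beta + \dot\gamma\right) = \text{constant} \tag{9.132}$$

It will simplify later formulas if this constant is defined in terms of another constant
$A$ such that

$$I'_3 \left(\dot\alpha \cos\beta + \dot\gamma\right) = I'_1 A \tag{9.133}$$

Since eqn (8.272) gives $\omega'_3 = \left(\dot\alpha \cos\beta + \dot\gamma\right)$, eqn (9.132) implies that $\omega'_3$ is a constant,
equal to its value at time zero, $\omega'_3 = \omega'_{30}$. The constant $A$ in eqn (9.133) can be
determined at time zero by the condition

$$A = \frac{I'_3 \left(\dot\alpha_0 \cos\beta_0 + \dot\gamma_0\right)}{I'_1} = \frac{I'_3\, \omega'_{30}}{I'_1} \tag{9.134}$$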



The reduced Lagrange equation for $\alpha$ is

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot\alpha}\right) - \frac{\partial L}{\partial \alpha} = 0 \tag{9.135}$$

which simplifies to

$$I'_1 \dot\alpha \sin^2\beta + I'_3 \left(\dot\alpha \cos\beta + \dot\gamma\right) \cos\beta = \text{constant} \tag{9.136}$$

Again, this constant is defined in terms of another constant $B$ such that

$$I'_1 \dot\alpha \sin^2\beta + I'_3 \left(\dot\alpha \cos\beta + \dot\gamma\right) \cos\beta = I'_1 B \tag{9.137}$$

Inserting eqn (9.133) and canceling the common $I'_1$ factors, this becomes

$$\dot\alpha \sin^2\beta + A \cos\beta = B \tag{9.138}$$

The constant $B$ is thus determined from the conditions at time zero as

$$B = \dot\alpha_0 \sin^2\beta_0 + A \cos\beta_0 \tag{9.139}$$

where $A$ is determined from eqn (9.134).

The variable $\beta$ is not ignorable. Its Lagrange equation involves a second time
derivative of $\beta$ and will be bypassed in favor of the generalized energy theorem that
provides a first order differential equation for the same variable. The reduced
generalized energy function defined in eqn (3.82) is

$$H = \dot\alpha \frac{\partial L}{\partial \dot\alpha} + \dot\beta \frac{\partial L}{\partial \dot\beta} + \dot\gamma \frac{\partial L}{\partial \dot\gamma} - L = \frac{1}{2} I'_1 \left(\dot\alpha^2 \sin^2\beta + \dot\beta^2\right) + \frac{1}{2} I'_3 \left(\dot\alpha \cos\beta + \dot\gamma\right)^2 + MgR \cos\beta \tag{9.140}$$

Since $\partial L(\alpha, \beta, \gamma, \dot\alpha, \dot\beta, \dot\gamma, t)/\partial t = 0$ here, eqn (3.83) shows that the reduced
generalized energy function is a constant equal to its value at time zero,

$$\frac{1}{2} I'_1 \left(\dot\alpha^2 \sin^2\beta + \dot\beta^2\right) + \frac{1}{2} I'_3 \left(\dot\alpha \cos\beta + \dot\gamma\right)^2 + MgR \cos\beta = \frac{1}{2} I'_1 \left(\dot\alpha_0^2 \sin^2\beta_0 + \dot\beta_0^2\right) + \frac{1}{2} I'_3 \left(\dot\alpha_0 \cos\beta_0 + \dot\gamma_0\right)^2 + MgR \cos\beta_0 \tag{9.141}$$

Due to the constancy of $\omega'_3$ noted in eqn (9.132), the terms involving $I'_3$ cancel.
Multiplying through by $2/I'_1$ then gives

$$\dot\alpha^2 \sin^2\beta + \dot\beta^2 + \frac{2MgR}{I'_1} \cos\beta = C \tag{9.142}$$

where $C$ is a constant given in terms of conditions at time zero as

$$\dot\alpha_0^2 \sin^2\beta_0 + \dot\beta_0^2 + \frac{2MgR}{I'_1} \cos\beta_0 = C \tag{9.143}$$

Equation (9.138) can be solved for $\dot\alpha$ as

$$\dot\alpha = \frac{B - A \cos\beta}{\sin^2\beta} \tag{9.144}$$

and used to eliminate $\dot\alpha$ from eqn (9.142), giving

$$\frac{\left(B - A \cos\beta\right)^2}{\sin^2\beta} + \dot\beta^2 + \frac{2MgR}{I'_1} \cos\beta = C \tag{9.145}$$

This equation can now be solved for $\dot\beta^2$,

$$\dot\beta^2 = C - \frac{2MgR}{I'_1} \cos\beta - \frac{\left(B - A \cos\beta\right)^2}{\sin^2\beta} \tag{9.146}$$

and simplified further by the usual substitution $u = \cos\beta$, which implies that
$(1 - u^2) = \sin^2\beta$ and $-\dot u/\sqrt{1 - u^2} = \dot\beta$, and gives

$$\dot u^2 = f(u) = \left(C - \frac{2MgR}{I'_1}\, u\right)\left(1 - u^2\right) - \left(B - Au\right)^2 \tag{9.147}$$

Equation (9.147) is a differential equation for the variable $u = \cos\beta$ and can be
solved by writing

$$\frac{du}{dt} = \pm\sqrt{f(u)} \qquad \text{and hence} \qquad t = \int_0^t dt' = \pm \int_{u_0}^{u} \frac{du'}{\sqrt{f(u')}} = F(u, u_0) \tag{9.148}$$

and then inverting function $F$ to get $u$ as a function of time.

The general solution in eqn (9.148) involves elliptic integrals. For our purposes
here, it will suffice to extract some generalities about the motion. The physical range
of $u = \cos\beta$ is between $-1$ and $+1$. A plot of the cubic function $f(u)$ versus $u$ will
have $f(\pm\infty) = \pm\infty$ and $f(\pm 1) < 0$. Thus $f(u)$ must have one zero in the unphysical
region $u > 1$. There must be two other zeroes $u_1 < u_2$ in the physical region, with
$f(u) > 0$ for $u_1 < u < u_2$ since $\dot u^2$ in eqn (9.147) cannot be negative. The points $u_1$
and $u_2$ are called the turning points of the $\beta$ motion. The $\beta$ value will oscillate back
and forth between these points. Note that smaller values of $u$ correspond to the larger
values of $\beta$ and hence to lower positions of the top. Thus the top is lower at $u_1$ than
it is at $u_2$.
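The turning points can be located numerically without evaluating the elliptic integrals. The sketch below is one hedged way to do so: it computes the constants $A$, $B$, $C$ from initial conditions using eqns (9.134), (9.139), (9.143), then scans the physical interval for sign changes of $f(u)$ from eqn (9.147) and refines each by bisection. The function names, the argument `MgR` (the product $MgR$ passed as one number), and the number of scan samples are hypothetical choices.

```python
import math

def top_constants(I1, I3, MgR, alpha_dot0, beta0, beta_dot0, gamma_dot0):
    """Constants A, B, C of eqns (9.134), (9.139), (9.143)."""
    s, c = math.sin(beta0), math.cos(beta0)
    A = I3 * (alpha_dot0 * c + gamma_dot0) / I1
    B = alpha_dot0 * s * s + A * c
    C = alpha_dot0 ** 2 * s * s + beta_dot0 ** 2 + 2.0 * MgR / I1 * c
    return A, B, C

def f(u, A, B, C, I1, MgR):
    """Cubic f(u) of eqn (9.147)."""
    return (C - 2.0 * MgR / I1 * u) * (1.0 - u * u) - (B - A * u) ** 2

def turning_points(A, B, C, I1, MgR, samples=4000):
    """Zeros of f(u) in -1 < u < 1, found by a sign-change scan plus bisection."""
    roots = []
    grid = [-1.0 + 2.0 * k / samples for k in range(samples + 1)]
    for a, b in zip(grid, grid[1:]):
        fa, fb = f(a, A, B, C, I1, MgR), f(b, A, B, C, I1, MgR)
        if fa * fb < 0.0:
            for _ in range(80):
                mid = 0.5 * (a + b)
                fm = f(mid, A, B, C, I1, MgR)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
    return roots
```

A scan of this kind can miss a pair of zeros that lie closer together than the grid spacing, so `samples` may need to be increased for a top that nutates only slightly.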

The oscillation of $\beta$ is called the nutation of the top. While that nutation is in
progress, the $\alpha$ variable is also changing with time. The change of $\alpha$ with time is
called the precession of the top. Equation (9.144) gives

$$\dot\alpha = \frac{B - Au}{1 - u^2} = A\, \frac{(B/A) - u}{1 - u^2} \tag{9.149}$$

Thus the direction⁴⁸ of the $\alpha$ motion depends on the relative values of $u$ and $(B/A)$.
If $(B/A) > u_2$ then $\dot\alpha > 0$ always (direct precession). If $(B/A) < u_1$ then $\dot\alpha < 0$ always
(retrograde precession). If $u_1 < (B/A) < u_2$ then $\dot\alpha < 0$ during the upper part of the
nutation cycle and $\dot\alpha > 0$ for the lower part. The precession will then be stationary
at the value $u_s = (B/A)$. The result of this last case will be a series of loops of the
symmetry axis, one loop per nutation cycle.
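The three cases can be summarized in a short sketch. The function names and the string labels are hypothetical; the code assumes $A > 0$ and strict inequalities, so the boundary case in which $(B/A)$ coincides with a turning point is not distinguished.

```python
def alpha_dot(u, A, B):
    """Precession rate of eqn (9.149) at u = cos(beta)."""
    return (B - A * u) / (1.0 - u * u)

def precession_type(A, B, u1, u2):
    """Classify the precession from B/A and the turning points u1 < u2 (A > 0)."""
    us = B / A
    if us > u2:
        return "direct"
    if us < u1:
        return "retrograde"
    return "looping"
```

In the looping case, `alpha_dot` evaluated at the upper turning point `u2` is negative and at the lower turning point `u1` is positive, matching the description in the text.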



48 We assume that the top is initially spun in a right-hand sense about axis $\hat{\mathbf{e}}'_3$ so that $A > 0$. If $A$ were
negative, the precession directions stated here would all be reversed.

9.18 The Initially Clamped Symmetric Top

There is one special case in which the analysis of Section 9.17 simplifies somewhat:
The top whose symmetry axis is clamped at time zero so that $\dot\alpha_0 = \dot\beta_0 = 0$.⁴⁹ When
the top is initially clamped, the various constants defined in Section 9.17 become

$$B = A \cos\beta_0 \qquad C = \frac{2MgR}{I'_1} \cos\beta_0 \tag{9.150}$$

and it is useful to define another constant

$$\psi_0 = \frac{I'_1 A^2}{4MgR} = \frac{I'_3}{I'_1} \left(\frac{\frac{1}{2} I'_3\, \dot\gamma_0^2}{2MgR}\right) \tag{9.151}$$

which is a rough measure of the speed of the top. Except for the factor $(I'_3/I'_1)$, which
is usually near unity, it is the ratio of the kinetic energy of the initial spin to the
maximum range of the potential energy values. For a fast top, this parameter should
therefore be very large.

For an initially clamped top with these parameters, eqn (9.147) becomes the product
of a linear and a quadratic factor

$$\dot u^2 = f(u) = \frac{2MgR}{I'_1} \left(u_0 - u\right) g(u) \tag{9.152}$$

where $u_0 = \cos\beta_0$ and

$$g(u) = 1 - u^2 - 2\psi_0 \left(u_0 - u\right) \tag{9.153}$$

One zero of $f(u)$ is seen from eqn (9.152) to be the initial value, $u_2 = u_0$. The other
turning point is that solution to the quadratic equation $g(u_1) = 0$ that lies in the
physical range $-1 < u_1 < 1$. It is

$$u_1 = \psi_0 - \sqrt{\psi_0^2 - 2\psi_0 u_0 + 1} \tag{9.154}$$

Then, completing the square of the expression in the square root, the difference $u_0 - u_1$
may be written

$$u_0 - u_1 = \left|\psi_0 - u_0\right| \left\{\sqrt{1 + \frac{1 - u_0^2}{\left(\psi_0 - u_0\right)^2}} - \frac{\psi_0 - u_0}{\left|\psi_0 - u_0\right|}\right\} \tag{9.155}$$

This expression is seen to be essentially positive, which shows that $u_0 = u_2$ is the
upper turning point. When released from its clamp, the top falls until it reaches $u_1$ at
which point it turns in its $\beta$ motion and returns to $u_0$ to begin another nutation cycle.
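The lower turning point of eqn (9.154) is easy to evaluate and check. The following sketch is illustrative, with hypothetical function names and sample values; `MgR` again stands for the product $MgR$ supplied as a single number.

```python
import math

def clamped_top(I1, I3, MgR, gamma_dot0, beta0):
    """Return (A, psi0, u0, u1) for an initially clamped top.

    Uses eqn (9.134) with alpha_dot0 = 0, then eqns (9.151) and (9.154).
    """
    A = I3 * gamma_dot0 / I1
    psi0 = I1 * A * A / (4.0 * MgR)
    u0 = math.cos(beta0)
    u1 = psi0 - math.sqrt(psi0 * psi0 - 2.0 * psi0 * u0 + 1.0)
    return A, psi0, u0, u1

def g(u, psi0, u0):
    """Quadratic factor of eqn (9.153)."""
    return 1.0 - u * u - 2.0 * psi0 * (u0 - u)
```

For a reasonably fast top the value `u1` lies just below `u0`, so the nutation range is narrow, and `g(u1, psi0, u0)` should vanish to rounding error.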



49 The third angle $\gamma$ carries the spin of the top, and is not clamped. The $\dot\gamma_0$ is often very large, in fact.

The value of the precession rate $\dot\alpha$ is found from eqn (9.149). For the initially
clamped top, this becomes

$$\dot\alpha = A\, \frac{u_0 - u}{1 - u^2} \tag{9.156}$$

Thus at the upper nutation turning point $u_2 = u_0$, the precession rate is zero. At the
lower point $u_1$, the precession rate is positive. The symmetry axis of the top therefore
executes a series of cusps, stopping its precession at the upper turning point and
maximizing it at the lower turning point. Equation (9.155) can be used to write the
precession rate at $u_1$ as

$$\dot\alpha_1 = \frac{A \left|\psi_0 - u_0\right|}{1 - u_1^2} \left\{\sqrt{1 + \frac{1 - u_0^2}{\left(\psi_0 - u_0\right)^2}} - \frac{\psi_0 - u_0}{\left|\psi_0 - u_0\right|}\right\} \tag{9.157}$$

If a fast top is assumed, eqn (9.157) can be expanded in powers of the small,
dimensionless quantity $\psi_0^{-1}$. If terms up to and including quadratic order in this quantity
relative to unity are retained throughout, the approximate value is



“i 



2 m 8 r 

I3Y0 



1 - 



2 Wq 

*0 




(9.158) 



9.19 Approximate Treatment of the Symmetric Top 

In beginning textbooks, the precession of a rapidly spinning top is treated
approximately. The total angular momentum is assumed to be constant in magnitude and
directed along the symmetry axis. Thus

$$\mathbf{J} \approx I'_3\, \dot\gamma_0\, \hat{\mathbf{e}}'_3 \tag{9.159}$$

In our exact treatment, this approximation is equivalent to assuming $\dot\gamma$ is a constant
equal to its initial value $\dot\gamma_0$, and that $\dot\gamma \gg \dot\alpha, \dot\beta$. Then the gravitational torque is
calculated from

$$\boldsymbol{\tau}^{(\mathrm{ext})} = \mathbf{R} \times \left(-Mg\, \hat{\mathbf{e}}_3\right) = MgR \sin\beta\, \hat{\mathbf{e}}''_2 \tag{9.160}$$

where $\hat{\mathbf{e}}''_2$ is a unit vector lying in the $\hat{\mathbf{e}}_1$-$\hat{\mathbf{e}}_2$ plane and making an angle $\alpha$ with the $\hat{\mathbf{e}}_2$
axis.⁵⁰ It is the result of rotating the $\hat{\mathbf{e}}_2$ axis by angle $\alpha$ in the right-hand sense about
axis $\hat{\mathbf{e}}_3$.

The elementary treatments ignore nutation and assume $\beta = \beta_0$ for all time.
Ignoring nutation, one can then calculate the time derivative of $\mathbf{J}$ as

$$\frac{d\mathbf{J}}{dt} \approx I'_3\, \dot\gamma_0\, \frac{d\hat{\mathbf{e}}'_3}{dt} = I'_3\, \dot\gamma_0\, \dot\alpha \sin\beta\, \hat{\mathbf{e}}''_2 \tag{9.161}$$

where eqn (8.266) has been used. Equating eqns (9.160, 9.161), and canceling the
common $\sin\beta$ factor then gives a constant precession rate

$$\dot\alpha = \frac{MgR}{I'_3\, \dot\gamma_0} \tag{9.162}$$

50 In Section 8.35, the $\hat{\mathbf{e}}''_i$ unit vectors are the result of rotating the inertial system unit vectors $\hat{\mathbf{e}}_i$ by Euler
angles $\alpha$, $\beta$ but not yet by $\gamma$. They are the next to last stage of the progression from the inertial to the
rotated system.

It is interesting to compare this approximate result with the values of $\dot\alpha$ obtained
in Section 9.18 for an initially clamped top in the fast-top limit. The precession rate
$\dot\alpha_1$ at the lower turning point is given by eqn (9.158). It is twice the approximate
value in eqn (9.162). However, the precession rate at the upper turning point is zero,
$\dot\alpha_2 = \dot\alpha_0 = 0$. So, in some rough sense, the elementary value might be thought of as
an average between zero and a value twice too large. A careful treatment, however,
would require a time average of $\dot\alpha$ over the nutation cycle, and not a simple average
of its values at the turning points.
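The factor of two can be checked numerically. The sketch below evaluates eqn (9.156) at the lower turning point of eqn (9.154) and compares it with the elementary rate of eqn (9.162). The function name and sample values are hypothetical. Note that $g(u_1) = 0$ in eqn (9.153) gives $1 - u_1^2 = 2\psi_0 (u_0 - u_1)$, so eqn (9.156) evaluated at $u_1$ reduces to $A/(2\psi_0)$.

```python
import math

def precession_rates(I1, I3, MgR, gamma_dot0, beta0):
    """Return (rate at lower turning point, elementary rate) for a clamped top."""
    A = I3 * gamma_dot0 / I1                       # eqn (9.134), alpha_dot0 = 0
    psi0 = I1 * A * A / (4.0 * MgR)                # eqn (9.151)
    u0 = math.cos(beta0)
    u1 = psi0 - math.sqrt(psi0 * psi0 - 2.0 * psi0 * u0 + 1.0)   # eqn (9.154)
    lower = A * (u0 - u1) / (1.0 - u1 * u1)        # eqn (9.156) at u = u1
    elementary = MgR / (I3 * gamma_dot0)           # eqn (9.162)
    return lower, elementary

lower, elementary = precession_rates(1.0, 0.5, 1.0, 20.0, 1.0)
ratio = lower / elementary
```

For very large $\psi_0$ the subtraction in `u0 - u1` loses precision in floating point, so the check is best done with moderate spin rates like the one shown.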



9.20 Inertial Forces 

We put aside the dynamics of rigid bodies now and consider the problem of rotating,
translating coordinate systems in general. The moving system considered now may
or may not be the body system of a rigid body. An observer sitting on or in a rigid
body and using its $\hat{\mathbf{e}}'_i$ system as his reference system (an astronaut riding on an
asteroid) might be an example. Or, the moving system could be defined by the walls of a
spacecraft which is accelerating and tumbling, or by the walls of a laboratory on the
rotating earth.

An observer doing mechanics experiments in a laboratory that is translating and 
rotating with respect to inertial space will experience anomalies due to his non- 
inertial reference system. If the observer is unaware of the source of these anoma- 
lies, he may attribute them to forces acting on the masses in his experiments. These 
“forces” are called inertial forces, or sometimes fictitious forces. 




Fig. 9.4. A translating and rotating coordinate system o' has its origin at vector displacement 
b relative to an inertial system o. A mass is located at r relative to the inertial system and 
at s relative to the moving system. 

Let us suppose that an observer is using a reference system whose origin is
located at $\mathbf{b}(t)$ relative to the origin of some inertial system, where $\mathbf{b}(t)$ is some general
function of time. And let his basis vectors $\hat{\mathbf{e}}'_i(t)$ be rotating such that their position at
time $t$ relative to some standard position at time zero is given by a rotation operator
$\mathcal{R}(t)$ with an associated angular velocity vector $\boldsymbol{\omega}(t)$. The vector $\mathbf{b}(t)$ may point to the
location of the center of mass of a rigid body as in the example of the astronaut, but
it need not. It can be any displacement, just as $\mathcal{R}(t)$ can be any rotation.

The position of a mass $m$ relative to the origin of the inertial system will be denoted
by $\mathbf{r}$ and the position of the same mass relative to the origin of the moving system by
$\mathbf{s}$. Thus



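$$\mathbf{r} = \mathbf{b} + \mathbf{s} \qquad \mathbf{v} = \frac{d\mathbf{b}}{dt} + \mathbf{u} \qquad \frac{d^2\mathbf{r}}{dt^2} = \frac{d^2\mathbf{b}}{dt^2} + \frac{d^2\mathbf{s}}{dt^2} \tag{9.163}$$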



where the inertial velocity and the velocity relative to the moving origin are denoted 



$$\mathbf{v} = \frac{d\mathbf{r}}{dt} \qquad \mathbf{u} = \frac{d\mathbf{s}}{dt} \tag{9.164}$$



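Now assume that the observer is not only measuring position relative to a moving
origin, but is also expressing his vectors relative to moving unit vectors $\hat{\mathbf{e}}'_i$. Thus

$$\mathbf{s} = \sum_{i=1}^{3} s'_i\, \hat{\mathbf{e}}'_i \quad \text{where} \quad s'_i = \hat{\mathbf{e}}'_i \cdot \mathbf{s} \tag{9.165}$$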



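and, using the body derivative developed in Section 8.33,

$$\mathbf{u} = \frac{d\mathbf{s}}{dt} = \left(\frac{d\mathbf{s}}{dt}\right)_b + \boldsymbol{\omega} \times \mathbf{s} = \mathbf{u}_b + \boldsymbol{\omega} \times \mathbf{s} \tag{9.166}$$

where the body derivative will be denoted by $\mathbf{u}_b$. It can be expanded as

$$\mathbf{u}_b = \left(\frac{d\mathbf{s}}{dt}\right)_b = \sum_{i=1}^{3} \frac{ds'_i}{dt}\, \hat{\mathbf{e}}'_i \tag{9.167}$$

and is the velocity that the observer actually measures. Note that $\mathbf{u}_b$ is not only
measured relative to a moving origin, but also is calculated as if the moving coordinate
unit vectors were fixed.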

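The next time derivative may now be taken,

$$\frac{d^2\mathbf{s}}{dt^2} = \frac{d}{dt}\left(\frac{d\mathbf{s}}{dt}\right) = \frac{d}{dt}\left(\mathbf{u}_b + \boldsymbol{\omega} \times \mathbf{s}\right) = \frac{d\mathbf{u}_b}{dt} + \frac{d\boldsymbol{\omega}}{dt} \times \mathbf{s} + \boldsymbol{\omega} \times \frac{d\mathbf{s}}{dt} \tag{9.168}$$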



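Each of the time derivatives in the last expression on the right may also be expanded
using body derivatives. The first one becomes

$$\frac{d\mathbf{u}_b}{dt} = \left(\frac{d\mathbf{u}_b}{dt}\right)_b + \boldsymbol{\omega} \times \mathbf{u}_b = \mathbf{a}_b + \boldsymbol{\omega} \times \mathbf{u}_b \tag{9.169}$$

where we have denoted

$$\mathbf{a}_b = \left(\frac{d\mathbf{u}_b}{dt}\right)_b = \sum_{i=1}^{3} \frac{d^2 s'_i}{dt^2}\, \hat{\mathbf{e}}'_i \tag{9.170}$$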



This is the acceleration that the observer would compute if he were operating in
complete ignorance of the fact that both his origin and his basis vectors are moving.
Also

$$\dot{\boldsymbol{\omega}} = \frac{d\boldsymbol{\omega}}{dt} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_b + \boldsymbol{\omega} \times \boldsymbol{\omega} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_b \tag{9.171}$$

shows that there is no difference between the inertial and body derivative of the
angular velocity vector itself.

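Putting the results of eqns (9.166, 9.169) into eqn (9.168) then gives the second
time derivative of $\mathbf{s}$ entirely in terms of body derivatives

$$\begin{aligned}
\frac{d^2\mathbf{s}}{dt^2} &= \mathbf{a}_b + \boldsymbol{\omega} \times \mathbf{u}_b + \dot{\boldsymbol{\omega}} \times \mathbf{s} + \boldsymbol{\omega} \times \left(\mathbf{u}_b + \boldsymbol{\omega} \times \mathbf{s}\right) \\
&= \mathbf{a}_b + \dot{\boldsymbol{\omega}} \times \mathbf{s} + 2\boldsymbol{\omega} \times \mathbf{u}_b + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{s})
\end{aligned} \tag{9.172}$$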
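The decomposition in eqn (9.172) can be checked numerically. The sketch below is only an illustration under stated assumptions: the moving axes rotate about the fixed $\hat{\mathbf{e}}_3$ axis through an arbitrarily chosen angle `theta(t)`, and the body components `s_body(t)` are an arbitrary smooth test motion. None of these choices, nor the helper names, come from the text. The code compares a central-difference estimate of $d^2\mathbf{s}/dt^2$ in the inertial system with the right side of eqn (9.172).

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def vsum(*vectors):
    """Component-wise sum of any number of 3-vectors."""
    return tuple(sum(components) for components in zip(*vectors))

# Assumed test rotation: angle theta(t) = 0.3 t + 0.2 t^2 about the fixed e3 axis.
def theta(t):
    return 0.3*t + 0.2*t*t

def omega(t):
    return (0.0, 0.0, 0.3 + 0.4*t)      # angular velocity vector

OMEGA_DOT = (0.0, 0.0, 0.4)             # its (constant) time derivative

# Assumed body components s'_i(t), with their first and second time derivatives.
def s_body(t):
    return (math.cos(t), t*t, math.sin(2.0*t))

def u_body(t):
    return (-math.sin(t), 2.0*t, 2.0*math.cos(2.0*t))

def a_body(t):
    return (-math.cos(t), 2.0, -4.0*math.sin(2.0*t))

def to_inertial(components, t):
    """Convert components along the rotated axes into inertial components."""
    c, s = math.cos(theta(t)), math.sin(theta(t))
    x, y, z = components
    return (c*x - s*y, s*x + c*y, z)

def rhs_9_172(t):
    """a_b + omega_dot x s + 2 omega x u_b + omega x (omega x s)."""
    s = to_inertial(s_body(t), t)
    u_b = to_inertial(u_body(t), t)
    a_b = to_inertial(a_body(t), t)
    w = omega(t)
    coriolis_term = tuple(2.0*x for x in cross(w, u_b))
    return vsum(a_b, cross(OMEGA_DOT, s), coriolis_term, cross(w, cross(w, s)))

def lhs_finite_difference(t, h=1e-3):
    """Central-difference estimate of the inertial second derivative of s."""
    plus = to_inertial(s_body(t + h), t + h)
    mid = to_inertial(s_body(t), t)
    minus = to_inertial(s_body(t - h), t - h)
    return tuple((p - 2.0*m + q)/(h*h) for p, m, q in zip(plus, mid, minus))

def max_mismatch(t):
    """Largest component difference between the two sides of eqn (9.172)."""
    return max(abs(l - r) for l, r in zip(lhs_finite_difference(t), rhs_9_172(t)))
```

Calling `max_mismatch(t)` at a few times should return values limited only by the finite-difference error, which for the default step is far below the size of the individual terms.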



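Now suppose that an experiment consists of observing the motion of a mass $m$
acted on by net real force $\mathbf{f}$, which may be composed of contact forces, gravity, spring
forces, electromagnetic forces, etc. Then, using eqn (9.163), Newton’s second law in
the inertial system gives

$$\mathbf{f} = m\frac{d^2\mathbf{r}}{dt^2} = m\frac{d^2\mathbf{b}}{dt^2} + m\frac{d^2\mathbf{s}}{dt^2} = m\frac{d^2\mathbf{b}}{dt^2} + m\left\{\mathbf{a}_b + \dot{\boldsymbol{\omega}} \times \mathbf{s} + 2\boldsymbol{\omega} \times \mathbf{u}_b + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{s})\right\} \tag{9.173}$$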



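The mass times acceleration measured by the observer in the moving system will thus
be

$$\begin{aligned}
m\mathbf{a}_b &= \mathbf{f} - m\frac{d^2\mathbf{b}}{dt^2} - m\dot{\boldsymbol{\omega}} \times \mathbf{s} - 2m\boldsymbol{\omega} \times \mathbf{u}_b - m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{s}) \\
&= \mathbf{f} + \mathbf{f}^{(\mathrm{trans})} + \mathbf{f}^{(\mathrm{ang})} + \mathbf{f}^{(\mathrm{cor})} + \mathbf{f}^{(\mathrm{cent})}
\end{aligned} \tag{9.174}$$

where the inertial forces and their names are:

$$\mathbf{f}^{(\mathrm{trans})} = -m\frac{d^2\mathbf{b}}{dt^2} \qquad \text{Translation of origin force} \tag{9.175}$$

$$\mathbf{f}^{(\mathrm{ang})} = -m\dot{\boldsymbol{\omega}} \times \mathbf{s} \qquad \text{Change of angular velocity force} \tag{9.176}$$

$$\mathbf{f}^{(\mathrm{cor})} = -2m\boldsymbol{\omega} \times \mathbf{u}_b \qquad \text{Coriolis force} \tag{9.177}$$

$$\mathbf{f}^{(\mathrm{cent})} = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{s}) \qquad \text{Centrifugal force} \tag{9.178}$$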



Notice that all of these inertial forces are proportional to the mass $m$ of the particle.
This proportionality comes from the fact that these “forces” are actually correction
terms that appear when $m\mathbf{a}_b$ is used in place of $m\mathbf{a} = m\,(d^2\mathbf{r}/dt^2)$ in Newton’s second
law.
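As an illustration of eqns (9.175) to (9.178), the following minimal sketch evaluates the four inertial forces from given vectors. The function name `inertial_forces`, the dictionary keys, and the sample numbers are hypothetical choices made for this example, and all vectors are assumed to be expressed in one common basis.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def inertial_forces(m, b_ddot, omega, omega_dot, s, u_b):
    """Return the inertial forces of eqns (9.175)-(9.178) as a dict of 3-tuples.

    m         : mass of the particle
    b_ddot    : second time derivative of the origin displacement b
    omega     : angular velocity vector of the moving system
    omega_dot : time derivative of omega
    s         : position relative to the moving origin
    u_b       : body-derivative velocity measured by the moving observer
    """
    return {
        "trans": tuple(-m*x for x in b_ddot),                        # (9.175)
        "ang":   tuple(-m*x for x in cross(omega_dot, s)),           # (9.176)
        "cor":   tuple(-2.0*m*x for x in cross(omega, u_b)),         # (9.177)
        "cent":  tuple(-m*x for x in cross(omega, cross(omega, s))), # (9.178)
    }

# Example: a platform turning steadily about the third axis at 2 rad/s,
# a unit mass 3 m from the axis, moving radially outward at 1 m/s.
forces = inertial_forces(
    m=1.0,
    b_ddot=(0.0, 0.0, 0.0),
    omega=(0.0, 0.0, 2.0),
    omega_dot=(0.0, 0.0, 0.0),
    s=(3.0, 0.0, 0.0),
    u_b=(1.0, 0.0, 0.0),
)
```

In this example the centrifugal force points radially outward with magnitude $m\omega^2 s$, and the Coriolis force is perpendicular to $\mathbf{u}_b$, directed against the sense of rotation.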

A person driving a car that is rapidly accelerating forward will feel that the
translation-of-origin inertial force is pushing her backwards into the car seat. Relative to
inertial space, what is really happening is that the back of the seat of the car is pressing
forwards on her with a real force, to give her the acceleration she needs to keep up
with the accelerating car.

A person sitting facing forward on the bench of a merry-go-round that is rapidly 
speeding up will feel that the change-of-angular-velocity force is pushing her back- 
wards into the seat. The real, inertial effect is similar to the accelerating car. 

Even when the angular velocity is constant, a person on the merry-go-round
will feel that a centrifugal force is pushing him outwards from the center so that he
has to grab onto the pole to resist being thrown outwards. To see that the centrifugal
force is outwards, use the expansion formula for triple cross products to write

$$\mathbf{f}^{(\mathrm{cent})} = m\omega^2 \left\{\mathbf{s} - \hat{\boldsymbol{\omega}}\,(\hat{\boldsymbol{\omega}} \cdot \mathbf{s})\right\} = m\omega^2\, \mathbf{s}_\perp \tag{9.179}$$

where $\mathbf{s}$ has been decomposed as $\mathbf{s} = \mathbf{s}_\parallel + \mathbf{s}_\perp$, into vectors parallel and perpendicular
to $\boldsymbol{\omega}$ using the method in Section A.2. What is really happening, relative to inertial
space, is that in the absence of any forces, the rider would leave the merry-go-round
and go off on a tangent line with a constant, straight-line velocity. His grabbing the
pole provides the centripetal (inward) force that is required to keep him moving in a
circle.

The Coriolis force is more subtle. It is a velocity-dependent inertial force that
acts only on objects that are moving relative to the moving system, and always acts
at right angles to $\mathbf{u}_b$. It can be understood by considering the merry-go-round once
again. Suppose that a person riding on it throws a ball radially outwards relative to
the merry-go-round system. The centrifugal inertial force will appear to accelerate the
ball outwards, but will not change its apparent radial direction relative to the thrower.
The Coriolis force, however, will appear to deflect the ball in a direction opposite
to the direction of rotation of the merry-go-round. What is happening inertially is
that the tangential velocity imparted to the ball as it is thrown is smaller than the
tangential velocity of the region of the merry-go-round into which the ball flies. Thus
the ball lags behind. The thrower attributes this lag to a Coriolis inertial force.
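The lag of the thrown ball can be reproduced with a few lines of kinematics. The sketch below is an illustration only, under stated assumptions: the platform turns at a constant rate `omega` about a vertical axis, the thrower sits at radius `r0` on the rotating first axis, the ball leaves with speed `u` radially outward relative to the platform, and no horizontal real force acts on the ball after release. The function names are hypothetical. The ball moves on a straight line in the inertial system, and that line is then re-expressed in the rotating axes.

```python
import math

def ball_in_rotating_frame(t, omega, r0, u):
    """Position of the ball along the rotating axes at time t after release."""
    # Inertial straight-line motion: radial speed u plus the tangential
    # speed omega*r0 that the ball shares with the thrower at release.
    x = r0 + u*t
    y = omega*r0*t
    # Re-express the same point along axes rotated by the angle omega*t.
    c, s = math.cos(omega*t), math.sin(omega*t)
    return (c*x + s*y, -s*x + c*y)

def apparent_angle(t, omega, r0, u):
    """Angle of the ball as seen from the platform, measured from the
    thrower's radial line; negative values mean the ball lags the rotation."""
    x_rot, y_rot = ball_in_rotating_frame(t, omega, r0, u)
    return math.atan2(y_rot, x_rot)
```

For short times the sideways displacement is close to $-\omega u t^2$, which is what a constant Coriolis acceleration of magnitude $2\omega u$ from eqn (9.177) would produce.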



9.21 Laboratory on the Surface of the Earth 

Assume that a coordinate system at the center of the Earth with its unit vectors
pointing toward fixed stars is approximately an inertial system. Consider a translating and
rotating reference system with its origin on the surface of the Earth at latitude $\lambda$. For
definiteness, assume that the moving system unit vectors are

$$\hat{\mathbf{e}}'_1 = \text{South} \qquad \hat{\mathbf{e}}'_2 = \text{East} \qquad \hat{\mathbf{e}}'_3 = \text{Up} \tag{9.180}$$



Fig. 9.5. The origin of the o' system is fixed to the surface of the Earth at north latitude $\lambda$. The
o system at the center of the Earth is assumed to be inertial.

One immediate difference between this special case and the general theory of
Section 9.20 is that the vector $\mathbf{b}$, which was there taken to be free to move in a
general way, is now constrained to move with the Earth. Assuming the Earth to be
spherical, it is

$$\mathbf{b} = R_\oplus\, \hat{\mathbf{e}}'_3 \tag{9.181}$$

where $R_\oplus$ is the radius of the Earth. Hence

$$\frac{d\mathbf{b}}{dt} = R_\oplus \frac{d\hat{\mathbf{e}}'_3}{dt} = R_\oplus\, \boldsymbol{\omega} \times \hat{\mathbf{e}}'_3 = \boldsymbol{\omega} \times \mathbf{b} \tag{9.182}$$



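The angular velocity of the Earth can be found from eqn (8.197), assuming that the
Earth rotates about the fixed axis $\hat{\mathbf{e}}_3$ with constant angular speed $\omega_0$,

$$\boldsymbol{\omega} = \omega_0\, \hat{\mathbf{e}}_3 = \omega_0 \left(-\cos\lambda\; \hat{\mathbf{e}}'_1 + \sin\lambda\; \hat{\mathbf{e}}'_3\right) \tag{9.183}$$

which gives

$$\frac{d\mathbf{b}}{dt} = \omega_0 R_\oplus \cos\lambda\; \hat{\mathbf{e}}'_2 \tag{9.184}$$

Thus

$$\frac{d^2\mathbf{b}}{dt^2} = \omega_0 R_\oplus \cos\lambda\, \frac{d\hat{\mathbf{e}}'_2}{dt} = \omega_0 R_\oplus \cos\lambda\; \boldsymbol{\omega} \times \hat{\mathbf{e}}'_2 = \boldsymbol{\omega} \times \frac{d\mathbf{b}}{dt} \tag{9.185}$$

Combining eqns (9.182, 9.185) then gives

$$\frac{d^2\mathbf{b}}{dt^2} = \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{b}) = -\omega_0^2 \left\{\mathbf{b} - \hat{\boldsymbol{\omega}}\,(\hat{\boldsymbol{\omega}} \cdot \mathbf{b})\right\} = -\omega_0^2\, \mathbf{b}_\perp \tag{9.186}$$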



where, again using the decomposition given in Section A.2, $\mathbf{b}_\perp$ is the component of
$\mathbf{b} = \mathbf{b}_\parallel + \mathbf{b}_\perp$ that points directly away from the symmetry axis of the Earth $\hat{\boldsymbol{\omega}}$. Thus
the translation of origin inertial force from eqn (9.175) is

$$\mathbf{f}^{(\mathrm{trans})} = m\omega_0^2\, \mathbf{b}_\perp = m\omega_0^2 R_\oplus \cos\lambda\; \hat{\mathbf{b}}_\perp \tag{9.187}$$

where $\hat{\mathbf{b}}_\perp$ is the unit vector formed from $\mathbf{b}_\perp$.

It is useful to write the real force $\mathbf{f}$ as a vector sum of the gravitational force $m\mathbf{g}$
and all other forces $\mathbf{f}^{(\mathrm{NG})}$, where the superscript stands for Not Gravitational. Then,
assuming that the angular velocity of the Earth is approximately constant, the observer
in the Earth laboratory will find that

$$m\mathbf{a}_b = \mathbf{f}^{(\mathrm{NG})} + m\left(\mathbf{g} + R_\oplus \omega_0^2 \cos\lambda\; \hat{\mathbf{b}}_\perp\right) - 2m\boldsymbol{\omega} \times \mathbf{u}_b - m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{s}) \tag{9.188}$$

The last term, the centrifugal inertial force from the rotation of the laboratory frame,
is usually negligible and will be dropped. The term containing the gravitational force
and the translational inertial force can be written as $m\mathbf{g}_e$, where

$$\mathbf{g}_e = \mathbf{g} + R_\oplus \omega_0^2 \cos\lambda\; \hat{\mathbf{b}}_\perp \tag{9.189}$$

is called the effective gravitational acceleration. It is the vector sum of the actual
gravitational force of attraction towards the center of the Earth and a term pointing
outwards from the Earth’s rotation axis and due to the centripetal acceleration of the
origin of coordinates as the Earth rotates. Then, finally, only the Coriolis inertial force
remains, and

$$m\mathbf{a}_b = \mathbf{f}^{(\mathrm{NG})} + m\mathbf{g}_e - 2m\boldsymbol{\omega} \times \mathbf{u}_b \tag{9.190}$$

An approximate calculation of the figure of the Earth, called the geoid, can be
made by initially assuming the Earth spherical as we have done and then calculating
$\mathbf{g}_e$. To first order, the oceans of the Earth should have surfaces perpendicular to $\mathbf{g}_e$
leading to a slight bulge at the equator, which is in fact observed.
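To get a feeling for the size of the correction in eqn (9.189), the sketch below evaluates $\mathbf{g}_e$ in the South, East, Up basis of eqn (9.180). It is an illustration under assumptions not taken from the text: the numerical values of the Earth's radius, rotation rate, and the magnitude of $\mathbf{g}$ are round illustrative numbers, the function names are hypothetical, and the decomposition $\hat{\mathbf{b}}_\perp = \sin\lambda\, \hat{\mathbf{e}}'_1 + \cos\lambda\, \hat{\mathbf{e}}'_3$ is worked out here from eqns (9.181, 9.183) rather than quoted.

```python
import math

# Assumed illustrative values, not taken from the text.
R_EARTH = 6.371e6     # radius of the Earth in m
OMEGA0 = 7.292e-5     # rotation rate of the Earth in rad/s
G_ATTRACT = 9.82      # magnitude of g toward the Earth's center in m/s^2

def effective_gravity(lat_deg):
    """Components (South, East, Up) of g_e from eqn (9.189) at north latitude lat_deg."""
    lam = math.radians(lat_deg)
    k = R_EARTH * OMEGA0**2 * math.cos(lam)   # size of the outward term
    south = k * math.sin(lam)                 # along e1' (South)
    up = -G_ATTRACT + k * math.cos(lam)       # along e3' (Up)
    return (south, 0.0, up)

def plumb_line_tilt_deg(lat_deg):
    """Angle in degrees between g_e and the direction to the Earth's center."""
    south, _, up = effective_gravity(lat_deg)
    return math.degrees(math.atan2(south, -up))
```

With these assumed numbers, `plumb_line_tilt_deg(45.0)` comes out at roughly a tenth of a degree toward the equator, and the magnitude of the Up component is smaller at the equator than at the pole, in line with the equatorial bulge mentioned above.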

Notice that, in the northern hemisphere, a wind from the north with $\mathbf{u}_b = u_b\, \hat{\mathbf{e}}'_1$ will
be deflected to the west by the Coriolis force, while a wind from the south (opposite
sign) will be deflected to the east. Similar deflections occur for east and west winds.
This pattern is thought to be responsible for the weather pattern in which low pressure
areas (with winds rushing in) in the northern hemisphere have winds circulating in
a right-handed sense about the up axis. In the southern hemisphere, the direction of
circulation is reversed.
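These deflection directions follow directly from eqns (9.177) and (9.183). The short sketch below evaluates the Coriolis acceleration $-2\boldsymbol{\omega} \times \mathbf{u}_b$ in the South, East, Up basis; the rotation rate is an assumed illustrative value and the function name `coriolis_acceleration` is hypothetical.

```python
import math

OMEGA0 = 7.292e-5   # assumed rotation rate of the Earth in rad/s

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def coriolis_acceleration(lat_deg, u_b):
    """Coriolis force per unit mass, -2 omega x u_b, in (South, East, Up) components.

    lat_deg : north latitude in degrees (negative in the southern hemisphere)
    u_b     : velocity measured in the laboratory, as (South, East, Up) components
    """
    lam = math.radians(lat_deg)
    omega = (-OMEGA0*math.cos(lam), 0.0, OMEGA0*math.sin(lam))   # eqn (9.183)
    return tuple(-2.0*c for c in cross(omega, u_b))

# A wind from the north moves toward the south: u_b along +e1'.
from_north = coriolis_acceleration(45.0, (10.0, 0.0, 0.0))
# A wind from the south moves toward the north: u_b along -e1'.
from_south = coriolis_acceleration(45.0, (-10.0, 0.0, 0.0))
```

At 45 degrees north the East component of `from_north` is negative (a westward deflection) and that of `from_south` is positive, while changing the sign of the latitude reverses both.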

The Coriolis force is usually negligible in laboratories on the surface of the Earth. 
However, if one designed a space station similar to the one in the film “2001 - A Space 
Odyssey” with rotation of a large toroidal ring providing an artificial gravity from the 
translation of origin inertial force, the Coriolis deflections could be troublesome in 
ordinary life. Exercise 9.5 considers such a space station. 

9.22 Coriolis Force Calculations 

The Coriolis inertial force is often small compared to other forces, such as gravity. 
This lends itself to an iterative approach. 

A zeroth-order calculation is first done for the motion of the system with Coriolis
forces ignored. The velocity $\mathbf{u}_b$ is calculated from that zeroth-order result, and applied
in eqn (9.177) to find a zeroth-order approximation to the Coriolis force $\mathbf{f}^{(\mathrm{cor})}$. This
approximate Coriolis force is then used to repeat the calculation for the motion of the
system, yielding a first-order approximation to the motion. This first-order
approximation is often sufficient, at least for estimates. But, if necessary, the iteration can be
repeated to second and higher orders.

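For example, consider a projectile fired in a southerly direction from the surface
of the Earth. With the Coriolis force ignored, the zeroth-order trajectory would be

$$\mathbf{s} = v_0 t \cos\alpha\; \hat{\mathbf{e}}'_1 + \left(v_0 t \sin\alpha - \tfrac{1}{2} g_e t^2\right) \hat{\mathbf{e}}'_3 \tag{9.191}$$

where $v_0$ is the muzzle velocity of the cannon and $\alpha$ is its angle from horizontal. The
zeroth-order body derivative is

$$\mathbf{u}_b = \left(\frac{d\mathbf{s}}{dt}\right)_b = v_0 \cos\alpha\; \hat{\mathbf{e}}'_1 + \left(v_0 \sin\alpha - g_e t\right) \hat{\mathbf{e}}'_3 \tag{9.192}$$

which is used to write the zeroth-order Coriolis force as

$$\mathbf{f}^{(\mathrm{cor})} = -2m\boldsymbol{\omega} \times \mathbf{u}_b = -2m\omega_0 \left\{v_0 \sin(\alpha + \lambda) - g_e t \cos\lambda\right\} \hat{\mathbf{e}}'_2 \tag{9.193}$$

After two integrations with the zeroth-order Coriolis force included, the first-order
trajectory is found to be

$$\mathbf{s} = v_0 t \cos\alpha\; \hat{\mathbf{e}}'_1 - \omega_0 t^2 \left(v_0 \sin(\alpha + \lambda) - \tfrac{1}{3} g_e t \cos\lambda\right) \hat{\mathbf{e}}'_2 + \left(v_0 t \sin\alpha - \tfrac{1}{2} g_e t^2\right) \hat{\mathbf{e}}'_3 \tag{9.194}$$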



As shown in Exercise 9.11, assuming $\alpha = 45°$ gives a first-order deflection at impact
that is to the west in the whole of the northern hemisphere and in the southern
hemisphere down to $\lambda \approx -20°$.
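The sign of the deflection can be explored numerically with the East component of the first-order trajectory, $-\omega_0 t^2 \left(v_0 \sin(\alpha + \lambda) - \tfrac{1}{3} g_e t \cos\lambda\right)$, evaluated at the zeroth-order time of flight $t = 2 v_0 \sin\alpha / g_e$. The sketch below is an illustration under assumptions: the default values of `g_e` and `omega0` are round numbers not taken from the text, the function names are hypothetical, and using the zeroth-order time of flight is itself an approximation consistent with first order.

```python
import math

def east_deflection(t, v0, alpha, lam, g_e, omega0):
    """East component of the first-order trajectory at time t (angles in radians)."""
    return -omega0 * t**2 * (v0*math.sin(alpha + lam) - g_e*t*math.cos(lam)/3.0)

def impact_time(v0, alpha, g_e):
    """Zeroth-order time of flight, when the Up component returns to zero."""
    return 2.0*v0*math.sin(alpha)/g_e

def deflection_at_impact(v0, alpha_deg, lat_deg, g_e=9.8, omega0=7.292e-5):
    """First-order East deflection at impact; negative values mean West.

    g_e and omega0 default to assumed illustrative values.
    """
    alpha = math.radians(alpha_deg)
    lam = math.radians(lat_deg)
    t_hit = impact_time(v0, alpha, g_e)
    return east_deflection(t_hit, v0, alpha, lam, g_e, omega0)
```

Substituting the time of flight shows that the deflection is proportional to $-(\tfrac{1}{3}\sin\alpha \cos\lambda + \cos\alpha \sin\lambda)$. For $\alpha = 45°$ this changes sign where $\tan\lambda = -1/3$, a latitude of about $-18°$, which is consistent with the rough figure of $-20°$ quoted above.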



9.23 The Magnetic - Coriolis Analogy 

Suppose that we have a set of particles located at radius vectors $\mathbf{r}_n(t)$ relative to
the origin of some inertial system of coordinates, where $n = 1, \ldots, N$. Suppose that
these particles have masses $m_n$ and electrical charges $q_n^{(\mathrm{ch})}$. If an external electric field
$\mathbf{E}(\mathbf{r}, t)$ and a uniform and static external magnetic induction field $\mathbf{B}(\mathbf{r}, t) = \mathbf{B}_0$ are
present, these particles will be acted on by the Lorentz forces

$$\mathbf{f}_n = q_n^{(\mathrm{ch})}\, \mathbf{E}(\mathbf{r}_n, t) + \frac{q_n^{(\mathrm{ch})}}{c}\, \mathbf{v}_n \times \mathbf{B}_0 \tag{9.195}$$

where $\mathbf{v}_n$ is the velocity of the $n$th particle relative to inertial space.

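There is a striking analogy between the magnetic part of eqn (9.195) and the
Coriolis force in a rotating coordinate system, given by eqn (9.177). Each involves the
cross product of a uniform vector with a particle velocity. If we transform eqn (9.195)
to a rotating coordinate system, this analogy allows us to use a suitably chosen $\boldsymbol{\omega}$ to
cancel the magnetic part of the Lorentz force.

To exploit the analogy, consider that the motion of the particles under the force
eqn (9.195) is now referred to a rotating system of coordinates. Assume that the
results of Section 9.20 are applied with $\mathbf{b} = 0$ so that the inertial and rotating origins
coincide, $\mathbf{v}_n = \mathbf{u}_n$, and $\mathbf{r}_n = \mathbf{s}_n$, and with $\boldsymbol{\omega}$ constant so that the change-of-angular-velocity
force vanishes. The equations of motion in the rotating system are then

$$\begin{aligned}
m_n (\mathbf{a}_n)_b = {} & \mathbf{f}_n^{(\mathrm{other})} + q_n^{(\mathrm{ch})}\, \mathbf{E}(\mathbf{r}_n, t) + \frac{q_n^{(\mathrm{ch})}}{c} \left\{(\mathbf{u}_n)_b + \boldsymbol{\omega} \times \mathbf{r}_n\right\} \times \mathbf{B}_0 - 2 m_n \boldsymbol{\omega} \times (\mathbf{u}_n)_b \\
& - m_n \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_n)
\end{aligned} \tag{9.196}$$

which may be written as

$$\begin{aligned}
m_n (\mathbf{a}_n)_b = {} & \mathbf{f}_n^{(\mathrm{other})} + q_n^{(\mathrm{ch})}\, \mathbf{E}(\mathbf{r}_n, t) - \left(\frac{q_n^{(\mathrm{ch})}}{c}\, \mathbf{B}_0 + 2 m_n \boldsymbol{\omega}\right) \times (\mathbf{u}_n)_b \\
& - \frac{q_n^{(\mathrm{ch})}}{c}\, \mathbf{B}_0 \times (\boldsymbol{\omega} \times \mathbf{r}_n) - m_n \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}_n)
\end{aligned} \tag{9.197}$$

where $\mathbf{f}_n^{(\mathrm{other})}$ represents forces other than electromagnetic, and $(\mathbf{u}_n)_b = (d\mathbf{r}_n/dt)_b$ and
$(\mathbf{a}_n)_b$ are the same body derivatives as defined in eqns (9.167, 9.170) but now applied
to the $n$th mass.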

If we assume that all of the particles have the same charge to mass ratio,
$q_n^{(\mathrm{ch})}/m_n = \chi$ independent of $n$, then the term in eqn (9.197) containing the body
velocities $(\mathbf{u}_n)_b$ can be eliminated by choosing $\boldsymbol{\omega}$ equal to what we will call the
Larmor angular velocity

$$\boldsymbol{\omega}_L = -\frac{\chi}{2c}\, \mathbf{B}_0 \tag{9.198}$$

Then the equation of motion in the rotating system becomes

$$m_n (\mathbf{a}_n)_b = \mathbf{f}_n^{(\mathrm{other})} + q_n^{(\mathrm{ch})}\, \mathbf{E}(\mathbf{r}_n, t) + m_n \boldsymbol{\omega}_L \times (\boldsymbol{\omega}_L \times \mathbf{r}_n) \tag{9.199}$$

Note that the centrifugal inertial force does not cancel. The combination of the
magnetic and centrifugal forces gives an effective force that is centripetal (tending toward
the center rather than away from it). In certain cases, this centripetal effective force
will be small, which leads to the following result, which we state as a theorem.

Theorem 9.23.1: Larmor Theorem

A system of charged particles with a uniform charge to mass ratio $\chi$ is placed in a
uniform, external magnetic field $\mathbf{B}_0$. If one chooses the Larmor angular frequency as in eqn
(9.198), and if the maximum centripetal force magnitude $\max_n (m_n \omega_L^2 r_n)$ is negligible
compared to other forces, then the problem can be solved by solving the related problem

$$m_n (\mathbf{a}_n)_b = \mathbf{f}_n^{(\mathrm{other})} + q_n^{(\mathrm{ch})}\, \mathbf{E}(\mathbf{r}_n, t) \tag{9.200}$$

in the rotating system and then transforming the result back to the inertial system.
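The cancellation behind this theorem can be checked numerically. The sketch below is an illustration only: it uses arbitrary sample numbers in Gaussian-style units, and the function names are hypothetical. It verifies that with $\boldsymbol{\omega} = \boldsymbol{\omega}_L$ the velocity-dependent terms $(q/c)\,\mathbf{u}_b \times \mathbf{B}_0 - 2m\boldsymbol{\omega} \times \mathbf{u}_b$ vanish, and that the remaining position-dependent terms $(q/c)(\boldsymbol{\omega} \times \mathbf{r}) \times \mathbf{B}_0 - m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r})$ reduce to $m\boldsymbol{\omega}_L \times (\boldsymbol{\omega}_L \times \mathbf{r})$ as in eqn (9.199).

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def larmor_angular_velocity(chi, c, B0):
    """omega_L = -(chi / 2c) B0, as in eqn (9.198)."""
    return tuple(-chi/(2.0*c)*component for component in B0)

def velocity_terms(m, q, c, B0, w, u_b):
    """(q/c) u_b x B0 - 2 m w x u_b: the terms that depend on the body velocity."""
    magnetic = cross(u_b, B0)
    coriolis = cross(w, u_b)
    return tuple(q/c*mg - 2.0*m*co for mg, co in zip(magnetic, coriolis))

def position_terms(m, q, c, B0, w, r):
    """(q/c) (w x r) x B0 - m w x (w x r): the terms that depend on position."""
    w_cross_r = cross(w, r)
    magnetic = cross(w_cross_r, B0)
    centrifugal = cross(w, w_cross_r)
    return tuple(q/c*mg - m*ce for mg, ce in zip(magnetic, centrifugal))

# Arbitrary sample values, chosen only for the check.
m, q, c = 2.0, 3.0, 1.0
B0 = (0.1, -0.4, 0.7)
u_b = (1.5, -2.0, 0.25)
r = (0.3, 1.1, -0.8)

w_L = larmor_angular_velocity(q/m, c, B0)
residual = velocity_terms(m, q, c, B0, w_L, u_b)
leftover = position_terms(m, q, c, B0, w_L, r)
expected = tuple(m*x for x in cross(w_L, cross(w_L, r)))
```

Here `residual` should vanish to rounding error, and `leftover` should agree with `expected`, the centripetal term that the theorem requires to be negligible.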

In other words, the motion with the magnetic field is approximately the same as
the motion without it, but rotated by the Larmor angular velocity. Notice that, for
positive charges, the direction of the Larmor rotation $\boldsymbol{\omega}_L$ is opposite to the direction
of $\mathbf{B}_0$.


9.24 Exercises

Exercise 9.1 Consider a flat (negligible thickness), uniform piece of rigid metal of mass $m$,
cut in the shape of a 45° right triangle. Its center of mass is on the symmetry line from the 
90° vertex and is one-third of the way up from the base. 

(a) Guess the principal axis directions, and calculate the three principal moments of inertia 
relative to the center of mass. 

(b) Check that your answers to part (a) obey the plane-figure theorem. 




Fig. 9.6. Illustration for Exercise 9.2. 



Exercise 9.2 Masses $m_1 = m_2 = m_3 = m_4 = m$ are located at the Cartesian coordinates
shown. These masses are at the points of a regular tetrahedron. The four triangular faces are
equilateral and identical.

(a) Use vector methods to check that distance 12 equals distance 23.

(b) Find the center of mass vector $\mathbf{R}$.

(c) Write out the four vectors $\boldsymbol{\rho}_n = \mathbf{r}_n - \mathbf{R}$ for $n = 1, 2, 3, 4$.

(d) The inertia operator is $\mathcal{I}^{(\mathrm{cm})}$, with a matrix in the $\hat{\mathbf{e}}_i$ system defined by the matrix
elements

$$I^{(\mathrm{cm})}_{ij} = \sum_{n=1}^{4} m_n \left\{\left(\rho_{n1}^2 + \rho_{n2}^2 + \rho_{n3}^2\right) \delta_{ij} - \rho_{ni}\, \rho_{nj}\right\} \tag{9.201}$$

Calculate the six independent moments and products of inertia in eqn (9.201) and write the
matrix $I^{(\mathrm{cm})}$.

Exercise 9.3 A projectile is thrown vertically upward from the surface of the Earth. Its initial
upward speed is $v_0$. It reaches a maximum height, and then falls back to the ground.

(a) Calculate its first-order vector Coriolis deflection. 

(b) The same projectile is now dropped from rest, its initial height being the same as the 
zeroth-order maximum height reached in part (a). Show that its first-order Coriolis deflection 
is in the opposite direction, and one-quarter as large, as that calculated in part (a). 

Fig. 9.7. Illustration for Exercise 9.4.

Exercise 9.4 A square stick of mass $m$ has square sides $a$ and length $b$. With the origin of
body-fixed coordinates at the center of mass, the principal axes of the stick are the symmetry
axes and the principal moments of inertia are $I_1^{(\mathrm{cm})} = I_2^{(\mathrm{cm})} = m(a^2 + b^2)/12$ and $I_3^{(\mathrm{cm})} =
ma^2/6$. A thin, massless rod is driven through the center of the stick, making an angle
with the stick’s long axis. It is glued to the stick rigidly. The massless rod is suspended in
frictionless bearings that hold it vertical (along the $\hat{\mathbf{e}}_3$ space-fixed axis). Using an external
motor, it is then rotated about that vertical axis with constant angular velocity $\boldsymbol{\omega}_0 = \omega_0\, \hat{\mathbf{e}}_3$.

(a) Write an expression for the angular velocity vector of the stick expressed in the body-fixed
system $\hat{\mathbf{e}}'_i$.

(b) Write an expression for S, the spin angular momentum vector of the stick, expressed in 
the body-fixed system. 

(c) Write an expression for the kinetic energy of the stick. How much work per second must 
the motor provide to keep the angular velocity constant? 

(d) Write an expression for the vector torque exerted on the system by the bearings and 
motor, also expressed in the body-fixed system. 




Fig. 9.8. Illustration for Exercise 9.5. 



